ABSTRACT

In hierarchical models of gravitational clustering, virialized haloes are biased tracers 
of the matter distribution. As discussed by Mo & White (1996), this bias is nonlinear 
and stochastic. They developed a model which allows one to write down analytic 
expressions for the mean of the bias relation, in the initial Lagrangian, and the evolved, 
Eulerian, spaces. We provide analytic expressions for the higher order moments as well. 

In the initial Lagrangian space, each halo occupies a volume that is proportional 
to its mass. Haloes cannot overlap initially, so this gives rise to volume exclusion effects 
which can have important consequences for the halo distribution, particularly on scales 
smaller than that of a typical halo. Our model allows one to include these volume 
exclusion effects explicitly when computing the mean and higher order statistics of 
the Lagrangian space halo distribution. As a result of dynamical evolution, the spatial 
distribution of haloes in the evolved Eulerian space is likely to be different from that in 
the initial Lagrangian space. When combined with the Mo & White spherical collapse 
model, the model developed here allows one to quantify the evolution of the mean 
and scatter of the bias relation. We also show how their approach can be extended 
to compute the evolution, not just of the haloes, but of the dark matter distribution 
itself. 

Biasing and its evolution depend on the initial power spectrum. Clustering from 
Poisson and white noise Gaussian initial conditions is treated in detail, since, in these 
cases, exact analytical results are available. We conjecture that these results can be 
easily extended to provide an approximate but accurate model for the biasing
associated with clustering from more general Gaussian initial conditions. For all initial
power spectra studied here, the model predictions for the Eulerian bias relation are in
reasonable agreement with numerical simulations of hierarchical gravitational
clustering for haloes of a wide range of masses, whereas the predictions for the corresponding
Lagrangian space quantities are accurate only for massive haloes. 

Key words: methods: analytical - galaxies: clusters: general - galaxies: formation - 
cosmology: theory - dark matter. 


1 INTRODUCTION 

In hierarchical models of gravitational clustering, it is possible
to use the statistical properties of the initial density
field, assumed to be Gaussian, to compute good approximations
to the average number density of virialized objects
at subsequent times (Press & Schechter 1974). In this paper,
the number density of virialized objects will be called
the unconstrained mass function. The statistical properties
of the initial dark matter distribution can also be used to
compute merger models which describe some aspects of how
virialized haloes at a late time were assembled by mergers of
smaller ones which, themselves, had virialized earlier (Bond
et al. 1991). For example, the average number of $M_1$ haloes
identified at $t_1$ that merged to form an $M_0$ halo by time $t_0$
can be computed (Lacey & Cole 1993, 1994). In this paper,
this quantity will be called the constrained mass function.
Associated with any given object is a merger history tree
which describes how the object was assembled. An analytic
model that describes the merger trees of dark matter haloes
has only been developed for the special case of Poisson initial
conditions (Sheth 1996). With some care, it can also
be used to describe the merger trees of haloes identified in
white noise initial conditions (Sheth & Pitman 1997; Sheth
& Lemson 1998).

In all these analyses, the number density of haloes was
computed, but their spatial distribution was not. Recently,
Mo & White (1996) described a model which uses the initial
dark matter distribution to estimate the initial Lagrangian
space distribution of dark matter haloes. Dynamical evolution
is likely to modify this distribution, so that the distribution
in the final Eulerian space is different from that
initially. Mo & White also formulated a model for this evolution.
In their model, statistical quantities in the Eulerian
space are obtained by transforming the corresponding
Lagrangian space quantities appropriately. The problem,
then, is to compute the Lagrangian space quantities,
since, once these are known, the corresponding Eulerian
quantities follow trivially.

In the Mo & White model, haloes are biased tracers of
the underlying matter distribution, the bias between haloes
and mass being, in general, nonlinear and stochastic. They
showed that, on average, the bias relation depends only on
the constrained and unconstrained mass functions, but that
knowledge of the higher order moments of the merger history
tree is required to compute the scatter around this mean
correctly. Since they did not have an analytic model for the
merger history tree, they were able to obtain analytic results
for the scatter in the bias relation, or for the halo-halo correlation
function, only in the limit of large separations. In
this limit, the mean bias relation is linear, and the scatter
around this relation is Poisson.

Since a halo in the Lagrangian space occupies a volume
that is proportional to its mass, and since haloes do not
overlap, the Lagrangian space halo distribution is a particular
case of a hard-sphere model. As Mo & White discuss,
the associated volume exclusion effects will introduce anticorrelations
on scales smaller than that of a typical halo. On
these scales, the scatter in the bias relation may well be less
than Poisson. This paper combines some of the ideas contained
in Mo & White (1996) with the analytic merger model
of Sheth (1996) to provide a description of the evolution of
the higher order moments of the halo distribution that incorporates
these exclusion effects explicitly. Thus, within the
context of the Mo-White model, the results presented here
are valid even on the small scales where the mean bias relation
is nonlinear.

Although the analytic merger tree described by Sheth
(1996) was derived for the special case of Poisson initial conditions,
it also describes the trees associated with white noise
Gaussian initial conditions (e.g. Sheth & Pitman 1997).
Sheth & Lemson (1998) showed that it could be used to
derive reasonably accurate analytic approximations to the
higher order moments of the merger tree distribution associated
with more general Gaussian initial conditions. When
combined with the Mo & White model, this allows us to
write down analytic approximations for the higher order moments
of, e.g., the bias relation, for more general Gaussian
initial conditions, that should also be reasonably accurate.

This paper is organized as follows. The Lagrangian
space halo distribution associated with white noise initial
conditions is described in Section 2. This section also serves
to set notation. The white noise results are extended to describe
the Lagrangian space halo distribution in more general
Gaussian random fields in Section 3. Section 4 contains
a brief summary of the Mo & White spherical collapse
model for computing Eulerian space quantities given
the corresponding Lagrangian ones. It also shows how the
model can be extended to compute the Eulerian space probability
distribution function of the matter as well as the
haloes. Section 5 shows the results of comparing the model
predictions with the distribution of haloes identified in numerical
simulations of gravitational clustering. This section
also compares the model predictions for the stochasticity of
the bias relation with what is measured in the simulations.
A final section summarizes our results.

All the Lagrangian space results of this paper follow
from results originally derived for haloes which form from
Poisson initial conditions. Since these initial conditions are
unfamiliar to most readers, the description of clustering from
Poisson initial conditions is given in an Appendix. The Poisson
case has the virtue that everything can be worked out
rigorously, so readers interested in the various subtle issues
involved in this approach are encouraged to read it.


2 WHITE-NOISE INITIAL CONDITIONS 


This section provides a description of the initial halo distribution
when the initial matter distribution is a white-noise
Gaussian random field. Sections 2.1-2.3 summarize various
known results. They are included to set notation, and to clarify
the logic that leads to the final expressions. Section 2.4
provides analytic expressions for the higher order moments
of the Lagrangian space halo distribution. These moments
are related to the higher order moments of the bias relation,
and are the principal new results of this paper.


2.1 Unconditional and conditional mass functions 

To set notation it is useful to summarize various known results.
Assume that the initial density field $\delta$ is Gaussian,
with power spectrum $P(k)$. If the field is smoothed with a
spherically symmetric filter of size $V$, then the smoothed
field $\delta(V)$ is also Gaussian. This means that the one point
probability distribution function is

$$p(\delta,V)\,{\rm d}\delta = \frac{1}{\sqrt{2\pi S}}\,\exp\left(-\frac{\delta^2}{2S}\right){\rm d}\delta, \tag{1}$$

where $S \equiv \langle\delta(V)^2\rangle$. That is,

$$S = \frac{1}{(2\pi)^3}\int 4\pi k^2\,P(k)\,W^2(kR)\,{\rm d}k, \tag{2}$$

where $W$ is the Fourier transform of the smoothing window,
and $V \propto R^3$ with the constant of proportionality depending
on the shape of the window. In this section we will mainly be
concerned with a window which is a top hat in real space, for
which $W(x) = (3/x^3)[\sin(x) - x\cos(x)]$, and $V = 4\pi R^3/3$.

Let $\bar\rho$ denote the average background density. If $P(k) \propto k^n$,
then $S \propto (\bar\rho V)^{-\alpha}$, where $\alpha = (n+3)/3$. If $n = 0$ the
random field is said to be white noise. The mass contained
within the filter is $M = \bar\rho V(1+\delta)$. Notice that when $S \ll 1$,
then $|\delta| \ll 1$ almost surely. In this case, $\delta < -1$ is extremely
unlikely, so there is no problem with defining the mass as
was done above.
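
The two ingredients just introduced, the top-hat window and the power-law scaling of the variance with mass, can be sketched numerically. The function names below, and the choice to fix the constant of proportionality through a reference mass `mass_ref` with variance `s_ref`, are assumptions made only for illustration; they are not part of the model itself.

```python
import math

def tophat_window(x):
    """Fourier transform of a real-space top hat: W(x) = 3 [sin x - x cos x] / x^3."""
    if abs(x) < 1e-3:
        # Leading terms of the Taylor series, used to avoid cancellation near x = 0.
        return 1.0 - x * x / 10.0
    return 3.0 * (math.sin(x) - x * math.cos(x)) / x**3

def variance_power_law(mass, n, mass_ref=1.0, s_ref=1.0):
    """S(M) for P(k) proportional to k^n: S = s_ref (M / mass_ref)^(-alpha), alpha = (n + 3) / 3.

    mass_ref and s_ref are hypothetical normalization parameters.
    """
    alpha = (n + 3.0) / 3.0
    return s_ref * (mass / mass_ref) ** (-alpha)

if __name__ == "__main__":
    # White noise (n = 0) gives S inversely proportional to M.
    for m in (0.5, 1.0, 2.0):
        print(m, variance_power_law(m, n=0), variance_power_law(m, n=-1.5))
```

For white noise the variance halves when the mass doubles, which is why $M$, $S$ and $V$ can be used interchangeably as labels of the smoothing scale.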

We will assume that $S \ll 1$ in the initial conditions,
which we will sometimes call the Lagrangian space. Then, in
Lagrangian space, $|\delta| \ll 1$, so to lowest order in $\delta$, $M = \bar\rho V$,
and $S \propto M^{-\alpha}$. We will always be concerned with initial
Gaussian fluctuation fields for which the relation between $S$
and $V$, and so the relation between $S$ and $M$, is monotonic.
Thus, in Lagrangian space, $M$, $S$ and $V$ are all equivalent
variables.

Most of the expressions associated with the excursion
set approach concern properties of Gaussian random fields
when they are smoothed on different scales. Here we will
assume that the filter is a top hat in real space, and that
the initial Lagrangian space distribution is Gaussian white
noise. For white noise $\alpha = 1$, so $S = (\bar\rho V)^{-1}$, and the conditional
probability that the field has value $\delta_1$ when smoothed
on scale $V_1$, given that it had value $\delta_0$ when smoothed on
scale $V_0$, is

$$p(\delta_1,V_1|\delta_0,V_0) = \frac{1}{\sqrt{2\pi(S_1-S_0)}}\,\exp\left(-\frac{(\delta_1-\delta_0)^2}{2(S_1-S_0)}\right). \tag{3}$$


Let $q(\delta_1,\delta_0,V_0)$ denote the probability that, when smoothed
on scale $V_0$, the density is $\delta_0$, and that it is less dense than
$\delta_1$ for all $V > V_0$. Then

$$q(\delta_1,\delta_0,V_0) = p(\delta_0,V_0)\left[1 - \exp\left(-\frac{2\delta_1(\delta_1-\delta_0)}{S_0}\right)\right] \tag{4}$$

provided $\delta_1 > \delta_0$, and it is equal to 0 otherwise (e.g. Chandrasekhar
1943). Of course, this means that $q < p$ as expected.
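
A short numerical sketch of the isolation probability follows. It evaluates $p$ and $q$ for given $\delta_1$, $\delta_0$ and $S_0$, and integrates $q$ over $\delta_0$ by a simple midpoint rule. The function names are hypothetical. The comparison value ${\rm erf}(\delta_1/\sqrt{2S_0})$ is the standard absorbing-barrier result for the fraction of Gaussian walks that have stayed below $\delta_1$; it is quoted here as a check and is not taken from the text above.

```python
import math

def p_gauss(delta0, s0):
    """One-point Gaussian distribution of the smoothed field, variance s0."""
    return math.exp(-delta0**2 / (2.0 * s0)) / math.sqrt(2.0 * math.pi * s0)

def q_isolated(delta1, delta0, s0):
    """Probability density that the cell has density delta0 and is below delta1 on all larger scales."""
    if delta0 >= delta1:
        return 0.0
    return p_gauss(delta0, s0) * (1.0 - math.exp(-2.0 * delta1 * (delta1 - delta0) / s0))

def isolated_fraction(delta1, s0, steps=20000):
    """Midpoint-rule integral of q over delta0 from (effectively) -infinity to delta1."""
    lo = -12.0 * math.sqrt(s0)          # the Gaussian tail below this is negligible
    h = (delta1 - lo) / steps
    return h * sum(q_isolated(delta1, lo + (k + 0.5) * h, s0) for k in range(steps))

if __name__ == "__main__":
    d1, s0 = 1.686, 1.0
    print(isolated_fraction(d1, s0), math.erf(d1 / math.sqrt(2.0 * s0)))
```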

In the excursion set approach, virialized dark matter
haloes are associated with isolated regions: these are those
Lagrangian regions that, when smoothed on some scale $V$,
are denser than some critical density, and when smoothed
on still larger scales, are less dense than this (Bond et al.
1991). All the mass contained within this critically overdense
isolated $V$ is associated with a virialized halo. This
required critical density is a function of time, but not of
smoothing scale $V$. It decreases with increasing time: haloes
that virialize at late times are associated with less dense
isolated regions in Lagrangian space than haloes which virialize
at early times. Let $\delta_c(z)$ denote this critical density,
and let $f(M,\delta_c)\,{\rm d}M$ denote the fraction of Lagrangian space
that is taken up by volumes $V$ that have density $\delta_c(z)$ when
smoothed on scale $V$, and are less dense on all larger scales,
so that each such isolated $V$ is associated with a halo of mass
$M$ that has just virialized at the epoch labelled by $z$. In Lagrangian
space, $S$, the mass $M$, and the associated volume
$V$ are all equivalent variables, so $f(M,\delta_c)\,{\rm d}M = f(S,\delta_c)\,{\rm d}S$,
and

$$f(S,\delta_c)\,{\rm d}S = \frac{\delta_c}{\sqrt{2\pi}\,S^{3/2}}\,\exp\left(-\frac{\delta_c^2}{2S}\right){\rm d}S \tag{5}$$

(Bond et al. 1991). The associated number density of such
isolated regions is the same as the number density of virialized
objects, and is given by

$$n(M,\delta_c)\,{\rm d}M = \frac{\bar\rho}{M}\,f(S,\delta_c)\,{\rm d}S. \tag{6}$$

This is sometimes called the unconstrained, or universal
mass function (Press & Schechter 1974). Now, since $S$ and $M$
are equivalent variables, the integral of $f(S,\delta_c)$ over all $S$ is
the same as the integral of $f(M,\delta_c)$ over all $M$. Equation (5)
shows that this integral is unity. This can be interpreted as
showing that associated with any given epoch $z$ is a partition
of the total Lagrangian volume into isolated regions of
volume $V$ and overdensity $\delta_c(z)$; the mass in each region $V$
first virializes to form a halo of mass $M = \bar\rho V$ at $z$.
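
The statement that the unconstrained mass function integrates to unity can be checked numerically. The sketch below integrates $f(S,\delta_c)$ in $\ln S$ with a midpoint rule and compares the result with the closed form ${\rm erfc}(\delta_c/\sqrt{2S})$ for the cumulative fraction, which follows from the substitution $\nu = \delta_c/\sqrt{S}$. The function names and the integration limits are illustrative assumptions.

```python
import math

def f_unconditional(s, delta_c):
    """First-crossing distribution f(S, delta_c) of equation (5)."""
    return delta_c / (math.sqrt(2.0 * math.pi) * s**1.5) * math.exp(-delta_c**2 / (2.0 * s))

def mass_fraction_closed(s_max, delta_c):
    """Integral of f from S = 0 to s_max: erfc(delta_c / sqrt(2 s_max))."""
    return math.erfc(delta_c / math.sqrt(2.0 * s_max))

def mass_fraction_quadrature(s_min, s_max, delta_c, steps=20000):
    """Midpoint rule in ln S; the integrand per unit ln S is S f(S)."""
    lo, hi = math.log(s_min), math.log(s_max)
    h = (hi - lo) / steps
    total = 0.0
    for k in range(steps):
        s = math.exp(lo + (k + 0.5) * h)
        total += s * f_unconditional(s, delta_c)
    return total * h

if __name__ == "__main__":
    dc = 1.686
    print(mass_fraction_quadrature(1e-3, 1e4, dc), mass_fraction_closed(1e4, dc))
    print(mass_fraction_closed(1e12, dc))   # approaches unity as S grows
```

The slow approach to unity reflects the $S^{-3/2}$ tail of $f$: a small fraction of the mass is always assigned to arbitrarily small haloes.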

Now consider some $\delta_1 > \delta_0$, where $\delta_1$ is a convenient
notation for $\delta_c(z_1)$, and we have assumed that $z_1 > z_0$, so
$z$ increases towards earlier epochs. Restrict attention to Lagrangian
regions $V_0$ that are associated with $M_0$ haloes at
the epoch $z_0$; i.e., isolated regions $V_0$. Consider one such isolated
region. Suppose that when smoothed on the scale $V_1 < V_0$
this region is denser than $\delta_1$, and that it is less dense than
this for all larger smoothing scales. Then $V_1$ is an isolated
subregion within $V_0$; this isolated Lagrangian subregion $V_1$
within $V_0$ can be associated with a subhalo $M_1$ of $M_0$; $M_1$
will first virialize at the epoch $z_1$. Let $f(M_1|M_0)\,{\rm d}M_1$ denote
the fraction of the mass of $M_0$ that, at the epoch $z_1$, is associated
with subclumps $M_1$. Since $S$ and $M$ are equivalent
variables, $f(M_1|M_0)\,{\rm d}M_1 = f(S_1|S_0)\,{\rm d}S_1$ where

$$f(S_1,\delta_1|S_0,\delta_0)\,{\rm d}S_1 = \frac{1}{\sqrt{2\pi}}\,\frac{(\delta_1-\delta_0)}{(S_1-S_0)^{3/2}}\,\exp\left(-\frac{(\delta_1-\delta_0)^2}{2(S_1-S_0)}\right){\rm d}S_1 \tag{7}$$


(Bond et al. 1991; Lacey & Cole 1993). Integrating this over
the range $0 \le M_1 \le M_0$ gives unity: all the mass of $M_0$
was in subclumps of some smaller mass at the earlier epoch
$z_1 > z_0$. This fraction can be converted into a mean number
of $M_1$ haloes within an $M_0$ halo:

$$\mathcal{N}(M_1,\delta_1|M_0,\delta_0) = \frac{M_0}{M_1}\,f(M_1,\delta_1|M_0,\delta_0). \tag{8}$$

Since $M_0 = \bar\rho V_0$, we should divide $\mathcal{N}(1|0)$ by $V_0$ to express
it as a number density. Then comparison with equation (6)
shows why this expression is sometimes called the
constrained mass function. Equation (8) can also be understood
as follows. For any given $z_1 > z_0$, the mass $M_0$ contained
within an isolated Lagrangian region $V_0$ within which
the average density is $\delta_0$, so the region first virializes at $z_0$,
can be thought of as being partitioned into isolated subregions,
each of slightly higher density $\delta_1$.
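
The partition interpretation of the constrained mass function can be illustrated for white noise. The sketch below assumes units in which $\bar\rho = 1$ and $S = 1/M$ exactly (the constant of proportionality is set to one purely for convenience), builds the mean number of $M_1$ subhaloes per unit mass from the conditional first-crossing distribution, and checks that the subhaloes account for all the mass of $M_0$. The function names and the substitution $t = \sqrt{M_1}$, used to tame the integrable singularity at small mass, are illustrative choices.

```python
import math

def f_conditional(s1, delta1, s0, delta0):
    """Conditional first-crossing distribution f(S1, delta1 | S0, delta0)."""
    ds, dd = s1 - s0, delta1 - delta0
    if ds <= 0.0 or dd <= 0.0:
        return 0.0
    return dd / (math.sqrt(2.0 * math.pi) * ds**1.5) * math.exp(-dd * dd / (2.0 * ds))

def mean_number(m1, delta1, m0, delta0):
    """Mean number of M1 haloes per unit M1 inside an M0 halo, white noise with S = 1/M."""
    if m1 >= m0:
        return 0.0
    # f(M1|M0) = f(S1|S0) |dS1/dM1|, and |dS1/dM1| = 1 / M1^2 when S = 1/M.
    f_mass = f_conditional(1.0 / m1, delta1, 1.0 / m0, delta0) / m1**2
    return (m0 / m1) * f_mass

def mass_in_subhaloes(m0, delta1, delta0, steps=20000):
    """Integral of M1 N(M1|M0) dM1 over 0 < M1 < M0, using M1 = t^2."""
    h = math.sqrt(m0) / steps
    total = 0.0
    for k in range(steps):
        t = (k + 0.5) * h
        m1 = t * t
        total += m1 * mean_number(m1, delta1, m0, delta0) * 2.0 * t
    return total * h

if __name__ == "__main__":
    print(mass_in_subhaloes(m0=1.0, delta1=2.0, delta0=0.5))   # should be close to m0
```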


2.2 The first moment of the Lagrangian space 
halo distribution 

The previous expressions mean that the mean number of
$(M_1,\delta_1)$ haloes that are in randomly placed Lagrangian cells
of size $V_0$ is $n(M_1,\delta_1)\,V_0$. Let $N(M_1,\delta_1|\delta_0,V_0)$ denote the
average number of $(M_1,\delta_1)$ haloes in a Lagrangian cell $V_0$
that has overdensity $\delta_0$. Then, by definition,

$$n(M_1,\delta_1)\,V_0 = \int_{-\infty}^{\infty} N(M_1,\delta_1|\delta_0,V_0)\;p(\delta_0,V_0)\,{\rm d}\delta_0. \tag{9}$$

Since mass and volume are equivalent variables, we will assume
that $\mathcal{N}(1|0) = 0$ if $M_1 > M_0$. Below, we show that
when $M_1 < M_0$, then $N(1|0)$ is related to $\mathcal{N}(1|0)$, and that
equation (9) is consistent with the results of the previous
subsection.

Classify all cells $V_0$ by the overdensity within them.
Each cell with density $\delta_0$ is either isolated or not. By definition,
cells with $\delta_0 > \delta_1$ are not isolated. For cells that are
not isolated, $N(1|0) = 0$. Since there is no contribution from
cells that are not isolated, to compute the average number
of $(M_1,\delta_1)$ haloes, we now need to sum up the contribution
from cells that are isolated.

If isolated, a cell can be partitioned into isolated subregions
that are identified with haloes. Label each such
partition ${\bf m}$, where ${\bf m}$ lists the mass associated with each
subregion. Let $\pi({\bf m})$ denote the set of all such partitions,
and let $p({\bf m}|M_0)$ denote the probability of having the particular
partition ${\bf m}$ (we have not written explicitly that this
probability will also depend on $\delta_1$ and $\delta_0$). We must integrate
over all partitions ${\bf m}$ of $(M_0,\delta_0)$ haloes, sum up the
number $n(M_1|{\bf m},M_0)$ of $(M_1,\delta_1)$ haloes in each partition,
weight by the probability $p({\bf m}|M_0)$ that that partition occurred,
and then integrate over all values of $\delta_0$, weighting by
the probability that $V_0$ with density $\delta_0$ is isolated. The sum
over partitions gives the average number of equation (8):

$$N(1|0) = \int_{\pi({\bf m})} n(M_1|{\bf m},M_0)\;p({\bf m}|M_0) = \mathcal{N}(1|0), \tag{10}$$

where the final equality follows because the integral is over
all partitions ${\bf m}$ of $M_0$, so it is the definition of $\mathcal{N}(1|0)$. This
means that

$$n(M_1,\delta_1)\,V_0 = \int_{-\infty}^{\delta_1} \mathcal{N}(1|0)\;q(\delta_1,\delta_0,V_0)\,{\rm d}\delta_0, \tag{11}$$

where the fact that only isolated cells give a nonzero contribution
to the integral in equation (9) means that the upper
limit in the integral over $\delta_0$ must be $\delta_1$, and that we
must replace $p(\delta_0,V_0)$ with $q(\delta_1,\delta_0,V_0)$, the fraction of cells
of density $\delta_0$ that are isolated. Simple algebra shows that
equations (4) and (8), when substituted into the right hand
side of this expression, do satisfy this relation.
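
The consistency check described above can also be done numerically. Because $M_0 = \bar\rho V_0$, the ratio $\mathcal{N}(1|0)/[n(M_1,\delta_1)V_0]$ reduces to the ratio of the conditional to the unconditional first-crossing distributions, so the test is that this ratio, weighted by $q$ and integrated over $\delta_0 < \delta_1$, equals one. The following is a minimal sketch; the function names, the parameter values and the midpoint rule are illustrative assumptions.

```python
import math

def first_crossing(ddelta, ds):
    """First-crossing distribution for barrier height ddelta over a variance interval ds."""
    if ddelta <= 0.0 or ds <= 0.0:
        return 0.0
    return ddelta / (math.sqrt(2.0 * math.pi) * ds**1.5) * math.exp(-ddelta**2 / (2.0 * ds))

def q_isolated(delta1, delta0, s0):
    """Probability density that a cell has density delta0 and is isolated with respect to delta1."""
    if delta0 >= delta1:
        return 0.0
    p = math.exp(-delta0**2 / (2.0 * s0)) / math.sqrt(2.0 * math.pi * s0)
    return p * (1.0 - math.exp(-2.0 * delta1 * (delta1 - delta0) / s0))

def mean_count_ratio(delta1, s1, s0, steps=20000):
    """Integral over delta0 of [N(1|0) / (n V0)] q, which should equal 1."""
    lo = -12.0 * math.sqrt(s0)          # Gaussian tail below this is negligible
    h = (delta1 - lo) / steps
    norm = first_crossing(delta1, s1)   # unconditional distribution, i.e. S0 = 0, delta0 = 0
    total = 0.0
    for k in range(steps):
        d0 = lo + (k + 0.5) * h
        total += first_crossing(delta1 - d0, s1 - s0) / norm * q_isolated(delta1, d0, s0)
    return total * h

if __name__ == "__main__":
    print(mean_count_ratio(delta1=1.5, s1=1.0, s0=0.25))
```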

The main reason for writing this out explicitly is that it 
shows how one might begin to quantify the extent to which 
virialized haloes are biased tracers of the underlying matter 
distribution. We do this in the next section. 


2.3 The mean bias relation and the cross 
correlation between haloes and mass 


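Let

$$\delta^{\rm L}_{\bf m}(1|0) = \frac{n(M_1|{\bf m},M_0)}{n(M_1,\delta_1)\,V_0} - 1 \tag{12}$$

denote the average overdensity of $M_1$ haloes within an $M_0$
halo that is known to be partitioned into the haloes ${\bf m}$.
Integrating this over all partitions gives

$$\delta^{\rm L}_h(1|0) = \int_{\pi({\bf m})} \delta^{\rm L}_{\bf m}(1|0)\;p({\bf m}|M_0) = \frac{\mathcal{N}(M_1,\delta_1|M_0,\delta_0)}{n(M_1,\delta_1)\,V_0} - 1. \tag{13}$$
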

This gives the mean overdensity of $(M_1,\delta_1)$ haloes that are
within $(M_0,\delta_0)$ haloes. It can also be understood as the
mean overdensity of isolated $(M_1,\delta_1)$ regions that are within
isolated $(M_0,\delta_0)$ regions in the Lagrangian space. In regions
that are not isolated (e.g., if $\delta_0 > \delta_1$), $\delta^{\rm L}_h = -1$. Thus,
$\delta^{\rm L}_h(1|0)$ is the same as the mean bias relation of equation (12)
in Mo & White (1996). The peak background split (their
equation 13) is obtained in the limit in which the cell size
$V_0$ is much larger than the Lagrangian size of an $M_1$ halo
(e.g. Bardeen et al. 1986),

$$\delta^{\rm L}_h(1|0) \to \frac{\nu_1^2 - 1}{\delta_1}\,\delta_0 = B(1|0)\,\delta_0, \tag{14}$$

where $\nu_1^2 = \delta_1^2/S_1$, and the final equality defines $B(1|0)$.

Notice that the mean overdensity of the halo distribution
is a linear function of the mass overdensity only in the
limit of equation (14). Equation (13) shows that, in general,
this mean bias relation is nonlinear. Just as the mean bias relation
depends on the mean number of haloes in Lagrangian
cells $(M_0,\delta_0)$, the higher order moments of the Lagrangian
bias relation depend on the higher order moments of the
Lagrangian space halo distribution. We will compute these
higher order moments in the next subsection. If these higher
order moments are nonzero, then there will be some scatter
around this mean bias relation: in addition to being nonlinear,
the bias will be stochastic.

Before doing so, we will first calculate the Lagrangian
space cross correlation between haloes and mass, averaged
over all randomly placed Lagrangian cells $V_0$. This is

$$\bar\xi^{\rm L}_{hm}(M_1,\delta_1|V_0) = \left\langle\left[\frac{N(1|0)}{n(M_1,\delta_1)\,V_0} - 1\right]\delta_0\right\rangle = \int_{-\infty}^{\infty} \frac{N(1|0)}{n(M_1,\delta_1)\,V_0}\;\delta_0\;p(\delta_0,V_0)\,{\rm d}\delta_0. \tag{15}$$

In the first line, the integral is over all Lagrangian cells,
so the second equality follows since $\langle\delta_0\rangle = 0$. This integral
is the sum of two terms, the first due to those Lagrangian
cells that are isolated, and the second due to those that are
not. However, $N(1|0) = 0$ for cells that are not isolated.
For isolated cells, the contribution is computed by a double
average, one over all values of $\delta_0$ with the substitution
$p(\delta_0) \to q(\delta_0)$, and the other over all partitions ${\bf m}$. The
integral over partitions gives $N(1|0) = \mathcal{N}(1|0)$, so

$$\bar\xi^{\rm L}_{hm}(M_1,\delta_1|V_0) = \int_{-\infty}^{\delta_1} \frac{\mathcal{N}(1|0)}{n(M_1,\delta_1)\,V_0}\;\delta_0\;q(\delta_1,\delta_0,V_0)\,{\rm d}\delta_0. \tag{16}$$

This expression for the cross correlation between haloes and
mass is the same as equation (15) in Mo & White (1996),
but with a difference in interpretation. As we have shown,
the average is to be understood as being over all randomly
placed Lagrangian cells $V_0$, not just those that are less dense
than $\delta_1$.

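The integral in equation (16) can be done analytically:

$$\frac{\bar\xi^{\rm L}_{hm}(1|0)}{S_0} = \frac{\delta_1}{S_0} - \frac{(\nu_{10}^2 + 1)}{\delta_1}\,{\rm erf}\left(\frac{\nu_{10}}{\sqrt{2}}\right) - \sqrt{\frac{2}{\pi}}\;\frac{\nu_{10}}{\delta_1}\;e^{-\nu_{10}^2/2}, \qquad {\rm where}\quad \nu_{10}^2 = \frac{\delta_1^2\,(S_1 - S_0)}{S_0\,S_1}. \tag{17}$$

When $S_0 \ll 1$, then the error function tends to unity and
the third term tends to zero. Thus,

$$\frac{\bar\xi^{\rm L}_{hm}(1|0)}{S_0} \to \frac{\nu_1^2 - 1}{\delta_1} = B(1|0). \tag{18}$$

This is consistent with using equation (14) for $\delta^{\rm L}_h(1|0)$ in
equation (15), since $\langle\delta_0^2\rangle = S_0$.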
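
A numerical sketch of equations (16)-(18) follows. It evaluates the integral of equation (16) by direct quadrature and compares it with a closed form written in terms of $\nu_{10}$. The closed form coded here was obtained by carrying out the Gaussian integrals in equation (16) directly, so its exact arrangement of terms should be treated as an assumption; the function names and parameter values are likewise illustrative.

```python
import math

def first_crossing(ddelta, ds):
    """First-crossing distribution for barrier height ddelta over a variance interval ds."""
    if ddelta <= 0.0 or ds <= 0.0:
        return 0.0
    return ddelta / (math.sqrt(2.0 * math.pi) * ds**1.5) * math.exp(-ddelta**2 / (2.0 * ds))

def q_isolated(delta1, delta0, s0):
    if delta0 >= delta1:
        return 0.0
    p = math.exp(-delta0**2 / (2.0 * s0)) / math.sqrt(2.0 * math.pi * s0)
    return p * (1.0 - math.exp(-2.0 * delta1 * (delta1 - delta0) / s0))

def xi_hm_quadrature(delta1, s1, s0, steps=20000):
    """Halo-mass cross correlation: integral of [N(1|0)/(n V0)] delta0 q over delta0 < delta1."""
    lo = -12.0 * math.sqrt(s0)
    h = (delta1 - lo) / steps
    norm = first_crossing(delta1, s1)
    total = 0.0
    for k in range(steps):
        d0 = lo + (k + 0.5) * h
        total += first_crossing(delta1 - d0, s1 - s0) / norm * d0 * q_isolated(delta1, d0, s0)
    return total * h

def xi_hm_closed(delta1, s1, s0):
    """Closed form in terms of nu10^2 = delta1^2 (S1 - S0) / (S0 S1)."""
    nu = math.sqrt(delta1**2 * (s1 - s0) / (s0 * s1))
    bracket = (delta1**2 / s0
               - (nu**2 + 1.0) * math.erf(nu / math.sqrt(2.0))
               - math.sqrt(2.0 / math.pi) * nu * math.exp(-nu**2 / 2.0))
    return s0 * bracket / delta1

def bias_factor(delta1, s1):
    """Large-cell limit B(1|0) = (nu1^2 - 1) / delta1 with nu1^2 = delta1^2 / S1."""
    return (delta1**2 / s1 - 1.0) / delta1

if __name__ == "__main__":
    print(xi_hm_quadrature(1.5, 1.0, 0.25), xi_hm_closed(1.5, 1.0, 0.25))
    print(xi_hm_closed(1.5, 1.0, 0.01) / 0.01, bias_factor(1.5, 1.0))
```

The second line of output illustrates the approach to the linear bias factor as the cell becomes large ($S_0 \ll 1$).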
2.4 Higher order moments of the bias relation
and halo halo correlations


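Suppose that there are $n$ $M_1$ haloes within an $M_0$ halo. The
Lagrangian volume associated with these haloes is $nV_1$. The
average overdensity of the remaining volume is

$$1 + \delta_0^{(n)} = \frac{M_0 - nM_1}{V_0 - nV_1}. \tag{19}$$

Since $M_1 = V_1(1+\delta_1)$ (in units in which $\bar\rho = 1$),

$$\delta_1 - \delta_0^{(n)} = (\delta_1 - \delta_0)\,\frac{M_0}{M_0 - nM_1} \tag{20}$$

to lowest order in the $\delta$ terms. With this definition, the $i$th
factorial moment is

$$\phi_i(M_1,\delta_1|M_0,\delta_0) = \prod_{n=0}^{i-1} \mathcal{N}\left(M_1,\delta_1\,\big|\,M_0 - nM_1,\,\delta_0^{(n)}\right) \tag{21}$$
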

provided $iM_1 < M_0$, and it is zero otherwise. This formula is
essentially a reworking of results originally in Sheth (1996).
See the Appendix of this paper or Sheth & Lemson (1998) for
details. Following the same logic as for the mean (the case
$i = 1$), the $i$th factorial moment of the corresponding halo
counts in cells distribution is

$$\int_{-\infty}^{\delta_1} \phi_i(M_1,\delta_1|M_0,\delta_0)\;q(\delta_1,\delta_0,V_0)\,{\rm d}\delta_0 = \Big(n(M_1,\delta_1)\,V_0\Big)^i\,\Big(1 + \Xi_i(M_1,\delta_1,V_0)\Big), \tag{22}$$

where the final equality defines $\Xi_i(M_1,\delta_1,V_0)$. If the scatter
of halo counts were Poisson, then $\Xi_i = 0$. For $i > 1$, equation (22)
can be solved analytically. For example, when $i$ is
even, then it reduces to a sum of incomplete Gamma functions.
Thus, it is possible to show explicitly that the scatter
is not Poisson.

Recall that the scatter in the bias relation is related
to the higher order moments of the halo distribution. For
example, the variance in the bias relation is essentially the
same as the variance in the halo distribution. In general,
this variance is neither zero, nor is it the same as the mean.
In other words, the mean bias is nonlinear, it is stochastic,
and the rms scatter around the mean is not the canonical
square-root-of-the-mean value that is typical of a Poisson
distribution. To see why, we turn now to a more detailed
study of the halo halo correlation functions.

Define

$$\omega \equiv \frac{\nu_{\rm rem}\,s_0}{2}, \qquad {\rm where}\quad \nu_{\rm rem} = 1 - \frac{S(M_0)}{S(iM_1)} \quad {\rm and} \quad s_0 = \frac{\delta_1^2}{S_0}, \tag{23}$$

with $S_0 = S(M_0)$. These two parameters have simple physical
interpretations. An $(M_1,\delta_1)$-halo occupies a volume $V_1$
in the initial Lagrangian space, and, by assumption, all the
mass within $V_1$ is associated with $M_1$. That is, haloes are
spatially exclusive; they do not overlap with each other. If a randomly
placed $V_0$ contains $i$ haloes, each of initial size $V_1$,
then $\nu_{\rm rem}$ is related to the fraction of $V_0$ that is not occupied
by these haloes. Since $M_0 \propto V_0$, $s_0$ expresses the cell
size $V_0$ in units of the (Lagrangian) size of typical haloes at
time $\delta_1$, since the usual definition of a typical $M_*$ halo is
that $\delta_1^2/S_* = \delta_1^2/S(M_*) = 1$.

For white noise, $\Xi_i(M_1,\delta_1,V_0)$ is not a function of $M_1$,
$\delta_1$, and $V_0$ individually, but only of $\nu_{\rm rem}$ and $s_0$. Thus,
$\Xi_i(M_1,\delta_1,V_0)$ has a self-similar form; for haloes defined
at a given $\delta_1$, it depends only on the cell size relative to
the size of typical objects with the same $\delta_1$, and on the
size of the objects being measured relative to the cell size.
Let $\bar\xi^{\rm L}_{hh}(11|0) = \Xi_2(M_1,\delta_1,V_0)$. Note that this means that
$\bar\xi^{\rm L}_{hh}(11|0)$ denotes the volume average of the halo-halo correlation
function. It is related to $\xi^{\rm L}_{hh}(r)$ itself by the relation

$$\bar\xi^{\rm L}_{hh}(11|0) = \frac{3}{R_0^3}\int_0^{R_0} \xi^{\rm L}_{hh}(r)\,r^2\,{\rm d}r.$$

This volume average is the variance of halo counts in Lagrangian
cells of size $V_0$ divided by the square of the mean
number of halo counts, minus the shot-noise contribution,
$1/[n(M_1,\delta_1)V_0]$, which accounts for the fact that the haloes are discrete
objects. Equation (22) implies that

$$1 + \bar\xi^{\rm L}_{hh}(11|0) = \frac{2}{s_0\sqrt{\pi}}\,\Big[\sqrt{\omega}\;e^{-\omega} + [\omega + 0.5]\,\gamma(0.5,\omega)\Big], \tag{24}$$

where $\omega$ is evaluated with $i = 2$ and $\gamma(a,x)$ is the (lower) incomplete Gamma function.


If $c(M_1,M_2,\delta_1|M_0,\delta_0)$ denotes the cross correlation between
$M_1$ and $M_2$ haloes, each with initial overdensity $\delta_1$,
that are both within the same $M_0$ halo of initial overdensity
$\delta_0$, then the same logic that led to equation (21) implies that

$$c(M_1,M_2,\delta_1|M_0,\delta_0) = \mathcal{N}(M_1,\delta_1|M_0,\delta_0)\;\mathcal{N}\left(M_2,\delta_1\,\big|\,M_0 - M_1,\,\delta_0^{(1)}\right), \tag{25}$$

where $\delta_0^{(1)}$ was defined earlier (equation 20). The volume averaged
cross correlation function is obtained by averaging $c(12|0)$
over all isolated volumes $V_0$:

$$1 + \bar\xi^{\rm L}_{hh}(12|0) = \int_{-\infty}^{\delta_1} \frac{c(M_1,M_2,\delta_1|M_0,\delta_0)}{n(M_1,\delta_1)V_0\;n(M_2,\delta_1)V_0}\;q(\delta_1,\delta_0,V_0)\,{\rm d}\delta_0. \tag{26}$$

Thus, $\bar\xi^{\rm L}_{hh}(12|0)$ is given by an expression that is exactly like
equation (24), except that now $\nu_{\rm rem} = (S_{12} - S_0)/S_{12}$, with
$S_{12} = S(M_1 + M_2)$. For white noise, the actual values of $S_1$
and $S_2$ are unimportant, only $S_{12}$ matters; given $M < M_0$,
$\bar\xi^{\rm L}_{hh}(M_1, M - M_1, \delta_1|V_0)$ is the same for all values of $M_1 < M$.
This suggests that when $\bar\xi^{\rm L}_{hh}(12|0)$ differs from zero, it is
because of volume exclusion effects only.
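
For white noise, the volume-averaged halo-halo correlation depends only on $\delta_1$, on the variance $S_{12}$ of the combined mass, and on the cell variance $S_0$. The sketch below codes the closed form in terms of $\omega = \nu_{\rm rem}s_0/2$, using $\gamma(1/2,\omega) = \sqrt{\pi}\,{\rm erf}(\sqrt{\omega})$. The overall prefactor $2/(s_0\sqrt{\pi})$ is fixed here by requiring the large-cell limit $\bar\xi \to S_0(1/\delta_1^2 - 1/S_{12})$, so it, together with the function names, should be read as an assumption of this illustration.

```python
import math

def lower_gamma_half(w):
    """Lower incomplete Gamma function gamma(1/2, w) = sqrt(pi) erf(sqrt(w))."""
    return math.sqrt(math.pi) * math.erf(math.sqrt(w))

def one_plus_xi_hh(delta1, s_pair, s0):
    """1 + volume-averaged halo-halo correlation for white noise.

    s_pair is S(M1 + M2); for equal masses it is S(2 M1) = S1 / 2.
    Returns 0 when the two haloes do not fit inside the cell (s_pair <= s0).
    """
    if s_pair <= s0:
        return 0.0
    nu_rem = 1.0 - s0 / s_pair          # fraction of the cell left unoccupied
    s0_scaled = delta1**2 / s0          # cell size in units of a typical halo
    w = 0.5 * nu_rem * s0_scaled
    bracket = math.sqrt(w) * math.exp(-w) + (w + 0.5) * lower_gamma_half(w)
    return 2.0 / (s0_scaled * math.sqrt(math.pi)) * bracket

if __name__ == "__main__":
    delta1, s_pair = 1.5, 1.0
    for s0 in (0.999, 0.5, 0.1, 0.001):
        print(s0, one_plus_xi_hh(delta1, s_pair, s0) - 1.0)
```

Cells barely large enough to hold the pair give values close to $-1$, which is the volume exclusion effect; very large cells approach the limiting form quoted in the lead-in.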

Figure 1 shows $\bar\xi^{\rm L}_{hh}(11|0)$ as a function of cell size $V_0$ for
white noise initial conditions. The different curves show a
range of choices of the halo mass $m_1$. Masses and scales are
in units of the characteristic mass $M_*$ and scale $V_* = M_*/\bar\rho$,
respectively. For white noise initial conditions, this is also a
plot of the average cross-correlation between haloes whose
mass sums to $2m_1$.

The shapes of these curves are easily understood. Consider
haloes that have the same mass $M$. Given this mass,
there are three scales in the problem: the Lagrangian scale
of each halo, $V$, the initial mean separation between such
haloes, $R$, and the Lagrangian scale associated with a typical
$M_*$ halo, $V_*$. Let $m = M/M_*$, $v = V/V_*$, $W \propto R^3$ and
$w = W/V_*$. Equation (6) shows that the number density of
less massive ($m \ll 1$) haloes is $\propto m^{-3/2}/V_*$. The mean separation
volume $W$ is the inverse of this, so $w < 1$. The number
density of massive haloes ($m \gg 1$) decreases exponentially.
For these haloes $w > 1$.

Now, by definition all haloes are anticorrelated on scales
smaller than that which they occupy (since it takes two
haloes to make a pair, this scale is $2V$). Massive haloes have
$w > 1$. Since the mean separation between such haloes is


Figure 1. The volume average of the halo-halo correlation function,
given by equation (24), as a function of cell size
$V_0$, for white noise initial conditions. The different curves are for
haloes of mass $m_1$ = 1/64, 1/16, 1/4, 1, 4 and 16, respectively. For
white noise initial conditions, this is also a plot of the average
cross-correlation between haloes whose mass sums to $2m_1$.


large, they are not affected by the fact that some of the volume
is excluded. Suppose that, on scales larger than $2V$,
these haloes were uncorrelated with each other. Then $\xi^{\rm L}_{hh} = -1$
on small scales, and $\xi^{\rm L}_{hh} = 0$ on larger scales, so that
on large scales the volume average is $\bar\xi^{\rm L}_{hh}(11|V_0) \propto -2V/V_0$.
This gives approximately the same qualitative behaviour as
the limiting relation (27). Namely, $\bar\xi^{\rm L}_{hh}$ is always negative,
and it becomes less negative with increasing scale $V_0$.

Less massive haloes have w < 1. These haloes are af¬ 
fected by the excluded volume, since a large fraction of the 
volume they could have occupied is now excluded. This 
means that they must all be crowded into the remaining 
volume, so over a range of scales, they will appear to be 
correlated with each other. Thus, for white noise initial con¬ 
ditions, volume exclusion produces two effects. Firstly, all 
haloes are anti-correlated on scales smaller than that which 
they occupy. Secondly, less massive haloes are positively cor¬ 
related on intermediate scales, whereas more massive haloes 
are essentially uncorrelated on all scales larger than those 
which they occupy. Thus, on small scales, and for less mas¬ 
sive haloes, volume exclusion gives rise to effects which are 
in the opposite sense to the commonly held view that less 
massive haloes are also less correlated. 


2.5 The large-volume limit 


Before moving on to consider more general initial conditions 
than white-noise, it is useful to write down the large scale 
limits of the Lagrangian space halo-halo correlation func¬ 
tions. 

When $M_0 \gg (M_1 + M_2)$ and $\delta_0 \ll \delta_1$, then use of the
asymptotic expansion of the error function reduces equation (26)
to

$$\bar\xi^{\rm L}_{hh}(12|0) \to \left[B(1|0)\,B(2|0) - \left(\frac{\nu_1\nu_2}{\delta_1}\right)^2\right] S_0, \tag{27}$$

where $\nu_2$ and $B(2|0)$ are defined similarly to $\nu_1$ and $B(1|0)$
(cf. equation 14). Since the factor $(\nu_1\nu_2/\delta_1)^2$ is not necessarily
small, this limiting form shows that volume exclusion effects
are important, even on large scales, for massive haloes.
In fact, in this limit $\bar\xi^{\rm L}_{hh}(12|0)/S_0 \to (1 - \nu_1^2 - \nu_2^2)/\delta_1^2$, so
massive haloes are less clustered than less massive haloes on
all scales.

For $n > 2$, define

$$H_n \equiv \frac{\bar\xi_n}{\bar\xi_2^{\,n-1}}, \qquad {\rm where}\quad \bar\xi_2 = \bar\xi^{\rm L}_{hh}, \tag{28}$$

and $\bar\xi_n$ denotes the volume average of the $n$-point Lagrangian
space correlation function of haloes that have the
same mass. It is usual to use $S_n$ to denote the corresponding
ratios of the mass correlation functions; for a Gaussian
random field, $S_n = 0$. In the large volume limit,
$\bar\xi_2 = (S_0/\delta_1^2)\,(1 - 2\nu^2)$, where $\nu$ is related to the halo mass
(equation 14). In this limit, equation (22) with the asymptotic
expansion of the error function yields

$$H_3 = \frac{9\nu^2\,(\nu^2 - 1)}{(1 - 2\nu^2)^2}, \qquad
H_4 = \frac{4\nu^2\,(-3 + 24\nu^2 - 16\nu^4)}{(1 - 2\nu^2)^3}, \qquad
H_5 = \frac{125\nu^4\,(3 - 10\nu^2 + 5\nu^4)}{(1 - 2\nu^2)^4}. \tag{29}$$

For massive haloes in this large cell limit

$$H_n \to (n/2)^{n-1} \qquad {\rm when}\quad \nu \gg 1. \tag{30}$$

These values are smaller than those associated with high
peaks in a Gaussian random field, for which $H_n = n^{n-2}$.
This is a consequence of volume exclusion. (For volume exclusion
effects associated with peaks, see Coles 1986 and
Lumsden, Heavens & Peacock 1989.)

It is interesting that these values are just those associated
with the Poisson limit of the Generalized Poisson distribution
(see, e.g., Saslaw & Sheth 1993). Thus, equation (30)
shows that, when smoothed on large scales, the Lagrangian
space distribution of massive haloes is Poisson.
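
The large-cell amplitudes of equation (29) and their massive-halo limit can be tabulated directly. The sketch below codes $H_3$, $H_4$ and $H_5$ as functions of $\nu$ and compares them with $(n/2)^{n-1}$ and with the high-peak values $n^{n-2}$. The function names are hypothetical, and the expressions are singular at $\nu^2 = 1/2$, where $\bar\xi_2$ vanishes.

```python
def h3(nu):
    x = nu * nu
    return 9.0 * x * (x - 1.0) / (1.0 - 2.0 * x) ** 2

def h4(nu):
    x = nu * nu
    return 4.0 * x * (-3.0 + 24.0 * x - 16.0 * x * x) / (1.0 - 2.0 * x) ** 3

def h5(nu):
    x = nu * nu
    return 125.0 * x * x * (3.0 - 10.0 * x + 5.0 * x * x) / (1.0 - 2.0 * x) ** 4

def massive_halo_limit(n):
    """Limit of H_n for nu >> 1: (n/2)^(n-1)."""
    return (n / 2.0) ** (n - 1)

def high_peak_value(n):
    """Value quoted for high peaks of a Gaussian random field: n^(n-2)."""
    return float(n) ** (n - 2)

if __name__ == "__main__":
    for n, h in ((3, h3), (4, h4), (5, h5)):
        print(n, h(2.0), h(50.0), massive_halo_limit(n), high_peak_value(n))
```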


3 GENERIC GAUSSIAN INITIAL CONDITIONS

Section 2 provided expressions for the constrained and unconstrained
halo mass functions, and for the moments of the
halo counts in cells distribution, for the special case of white
noise initial conditions. It is known that for more general
Gaussian initial conditions [i.e., the initial power spectrum
differs from $P(k) \propto k^0$], the constrained and unconstrained
mass functions have the same form as the white noise functions,
provided that all quantities are written in terms of
the variance, defined by equation (2). That is, the unconditional
and conditional mass functions for different initial
power spectra differ only because the transformation from
variance to mass depends on the initial power spectrum. For
example, if $P(k) \propto k^n$, then $S(M) \propto V^{-(n+3)/3} \propto M^{-(n+3)/3}$,
where we have used the additional fact that in Lagrangian
space $M$ and $V$ are equivalent variables. Recall that white
noise has $n = 0$, so in the previous section $S \propto 1/M$.

This section assumes that what works for the mass functions
works for the counts in cells distributions also. That
is, expressions for the moments of the halo distribution, when
written in terms of the variance, are assumed to have the
same form for all power spectra. There is no compelling reason
why this should be so. For example, the form of equation (21)
follows from the mutual independence of disconnected
subvolumes. While this is a reasonable assumption
for white noise initial conditions, it is almost certainly wrong
for other power spectra. Nevertheless, the hope is that those
correlations between neighbouring volumes which are ignored
when using equation (21) to estimate halo-halo correlations
will not make a crucial difference to the final answer,
for reasons discussed by Bower (1991). Moreover, Sheth &
Lemson (1998) showed that this simple model for the higher
order moments associated with the forest of merger history
trees is in reasonably good agreement with the results of
numerical simulations, even when the initial distribution is
quite different from white noise. Since it is these same higher
order moments that one uses to estimate halo-halo correlations,
their results suggest that this simple model should be
reasonably accurate here as well.

Another way to see why this conjecture should be accurate
is the following. The correlation function of haloes
of two different masses is the product of the mean number
of haloes of each of the two mass ranges times one plus the
halo-halo correlation function. In principle, all three terms
depend on the power spectrum, although we only know this
dependence for the two mean terms, and not for the correlation
function. In the white-noise case, were it not for volume
exclusion, this correlation term would be zero. For other initial
power spectra, our conjecture means that we adjust the two
mean terms correctly, and assume that most of the contribution
to the correlation term comes from volume exclusion
effects. This means that our conjecture does correctly account
for some, if not most, of the dependence of the correlation
function and other higher order moments on the
initial power spectrum.

The integral for the cross correlation between
haloes of mass $M_1$ and $M_2$ that one obtains by ignoring
these correlations can be solved analytically. The final
expression is lengthy, so we have not written it out below.
In the limit of large cells, i.e., $M_0 \gg (M_1 + M_2)$,

$$
\bar\xi_{hh}(12|0) \to \beta(1|0)\,\beta(2|0)\,S_0 + \text{correction terms}, \tag{31}
$$

provided $\delta_1 > \delta_0$. In general, the correction terms are not
as simple as in the white noise case, so we have not written
them down explicitly.

In general, the full expression for halo-halo correlations
differs from the white-noise expression in three significant
ways. Firstly, $\mathcal{N}(2|10)\,\mathcal{N}(1|0) = \mathcal{N}(1|20)\,\mathcal{N}(2|0)$ only when
$M_0$ is much greater than either $M_1$ or $M_2$. This implies
that, in general, the expression for $c(12|0)$ should be replaced with
either $c(2|10) = \mathcal{N}(2|10)\,\mathcal{N}(1|0)$, or $c(1|20)$, where $\mathcal{N}(2|10)$ is
understood as the average number of $M_2$ haloes within those
$M_0$ haloes that are known to have an $M_1$ halo in their central
volume element. The lack of spatial correlations for a white-noise
spectrum meant that there, this restriction was irrelevant.
Here, however, this means that $\bar\xi_{hh}(12|0)$ and $\bar\xi_{hh}(21|0)$
computed from these two choices are no longer equivalent.

Secondly, the halo-halo correlations depend on the
masses of the haloes themselves, rather than just their sum.
This suggests that volume exclusion effects are not the sole
cause of halo correlations. Thirdly, provided $S$ varies as some
inverse power of scale, then, in the limit of large separations,
sufficiently high mass haloes are more correlated than low
mass haloes. The correlation function of peaks in Gaussian
random fields is known to depend exponentially on peak
height (e.g., Bardeen et al. 1986; Jensen & Szalay 1986;
Lumsden, Heavens & Peacock 1989; Regos & Szalay 1995).
If high mass haloes correspond to high peaks in the initial
density field, then this result is qualitatively similar to
that for peaks. The agreement with the peaks results is only
qualitative. For example, just as in the white-noise case, the
higher-order moments of the spatial distribution of massive
haloes are different from those of high peaks.

Figure 2. The volume average of the halo-halo correlation function,
given by equation (26), as a function of cell
size $V_0$, when the initial power spectrum has slope $n = -1$.
The different curves are for haloes with mass $m_1 = s_1^{-3/2}$ and
$s_1 = 2^{4}, 2^{2.66}, 2^{1.33}, 1, 2^{-1.33}$ and $2^{-2.66}$, respectively.

Figure 3. The volume average of the halo-halo correlation function,
given by equation (26), as a function of cell
size $V_0$, when the initial power spectrum has slope $n = -2$.
The different curves are for haloes with mass $m_1 = s_1^{-3}$,
for four values of $s_1$, including $2^{2.25}$ and 1.

Figures 2 and 3 show the volume average of the halo-halo
correlation function (equation 26) as a function of scale,
when the initial power spectrum has slope $n = -1$ and
$n = -2$, respectively. A range of choices of halo mass are
shown. On scales smaller than $2v_1$, volume exclusion effects
mean that $\bar\xi_{hh} = -1$. As a result of halo exclusion effects,
haloes less massive than $M_*$ are positively correlated on
intermediate scales, and on scales larger than about $4v_1$,
$\bar\xi_{hh}(11|0)/s_0 \approx$ constant. On sufficiently large scales, haloes
that are more massive than $\sim M_*$ are more correlated than
less massive haloes.


4 THE HALO DISTRIBUTION IN EULERIAN SPACE


The previous sections showed how to quantify the difference 
between the halo and matter distributions in Lagrangian 
space. Dynamical evolution changes these distributions, so 
the bias between haloes and mass in Eulerian space is likely 
to be different from that initially. 

Mo & White (1996) argued that the bias relation in
Eulerian space, i.e., the mean overdensity of $\delta_1$-haloes that
are in spheres with comoving volume $V$ which contain mass
$M_0$ at $z$, so they have Eulerian overdensity

$$
\Delta = 1 + \delta = M_0/\bar\rho V, \tag{32}
$$

should be

$$
\delta_h(1|0) = \frac{\mathcal{N}(M_1,\delta_1|M_0,\delta_0)}{n(M_1,\delta_1)\,V} - 1, \tag{33}
$$

where $\mathcal{N}(1|0)$ is given by the corresponding Lagrangian expression, but
with

$$
\frac{\delta_0}{1+z} = 1.68647 - \frac{1.35}{\Delta^{2/3}} + \frac{0.788}{\Delta^{0.587}} - \frac{1.124}{\Delta^{1/2}}. \tag{34}
$$

Therefore, in their model, expressions for the higher order
moments of the bias relation in the Eulerian space can be
obtained by transforming the corresponding Lagrangian expressions
similarly. We will use this fact below.
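The spherical collapse mapping of equation (34) is simple to evaluate. The following is a minimal sketch, assuming the rounded coefficients quoted in equation (34); the function names are ours and purely illustrative.

```python
# Sketch of the spherical-collapse mapping in equation (34): given the Eulerian
# density Delta = 1 + delta of a cell, return the linear-theory overdensity
# delta_0 that the corresponding Lagrangian region must have had.

def delta0_over_growth(Delta):
    """Return delta_0 / (1 + z) for Eulerian density Delta = M_0 / (rho_bar V)."""
    if Delta <= 0.0:
        raise ValueError("Delta must be positive")
    return (1.68647
            - 1.35 / Delta ** (2.0 / 3.0)
            + 0.788 / Delta ** 0.587
            - 1.124 / Delta ** 0.5)


def delta0(Delta, z=0.0):
    """Linear overdensity delta_0 for a cell of density Delta observed at redshift z."""
    return (1.0 + z) * delta0_over_growth(Delta)


if __name__ == "__main__":
    for Delta in (0.5, 1.0, 2.0, 10.0, 178.0):
        print(Delta, delta0(Delta))
```

Three properties are worth noting: a cell at the mean density ($\Delta = 1$) maps to $\delta_0 \approx 0$; for small fluctuations the slope is close to unity, so $\delta_0 \approx \delta$ as in linear theory; and $\delta_0$ tends to 1.68647 as $\Delta$ becomes very large.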

Let $p(M_0|V, z)\,dM_0$ denote the probability that an Eulerian
cell $V$ contains mass in the range $dM_0$ of $M_0$ at $z$. We
will sometimes call this the Eulerian probability distribution
function. Of course, $p(M_0|V, z)\,dM_0 = p(\Delta|V, z)\,d\Delta$ and

$$
\int p(\Delta|V, z)\,d\Delta = \int \Delta\,p(\Delta|V, z)\,d\Delta = 1. \tag{35}
$$
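The two conditions in equation (35) (unit normalization and unit mean density) can be verified numerically for any trial distribution. The sketch below uses a Lognormal in $\Delta$ as an example; the parametrization with $\mu = -\sigma^2/2$ is our assumption, chosen so that the mean density is unity, and the integration routine is an illustrative trapezoidal rule in $\ln\Delta$.

```python
import math

# Sketch: check both conditions of equation (35) for a trial p(Delta|V).
# The Lognormal with mu = -sigma^2/2 is an assumed example, not the paper's fit.

def lognormal_pdf(Delta, sigma):
    """Lognormal density in Delta with <Delta> = 1 (requires mu = -sigma^2/2)."""
    mu = -0.5 * sigma**2
    arg = (math.log(Delta) - mu) ** 2 / (2.0 * sigma**2)
    return math.exp(-arg) / (Delta * sigma * math.sqrt(2.0 * math.pi))


def normalization_and_mean(pdf, x_min=-20.0, x_max=20.0, n_steps=20000):
    """Return (int p dDelta, int Delta p dDelta) via the trapezoidal rule in x = ln Delta."""
    h = (x_max - x_min) / n_steps
    norm = 0.0
    mean = 0.0
    for i in range(n_steps + 1):
        weight = 0.5 if i in (0, n_steps) else 1.0
        Delta = math.exp(x_min + i * h)
        integrand = pdf(Delta) * Delta  # Jacobian: dDelta = Delta dx
        norm += weight * integrand * h
        mean += weight * integrand * Delta * h
    return norm, mean


if __name__ == "__main__":
    print(normalization_and_mean(lambda D: lognormal_pdf(D, 1.0)))
```

Any candidate $p(\Delta|V)$ that fails either condition cannot represent a distribution of mass in cells of fixed volume.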


Let $N(M_1, \delta_1|M_0, V, z)$ denote the average number of
$(M_1,\delta_1)$-haloes in such a cell. Then the average number of
haloes in Eulerian cells of size $V$ is

$$
n(M_1,\delta_1)\,V = \int_0^\infty N(M_1,\delta_1|M_0,V)\,p(M_0|V)\,dM_0, \tag{36}
$$

where we have not bothered to write the dependence on $z$ explicitly.
This is the analogue of the Lagrangian relation (11).
Suppose we assume that

$$
N(M_1,\delta_1|M_0,V,z) = \mathcal{N}(M_1,\delta_1|M_0,\delta_0), \tag{37}
$$

where $\delta_0$ is given by equation (34). That is, the average
number of haloes in Eulerian cells of size $V$ that contain
mass $M_0$ is assumed to be the same as the average number
of haloes in Lagrangian cells $M_0$ that, because they originally
had overdensity $\delta_0(\Delta)$, have size $V$ at $z$. Then
equation (36) implies that

$$
f(M_1,\delta_1) = \int_{M_1}^\infty f(M_1,\delta_1|M_0,\delta_0)\,\Delta\,p(M_0|V)\,dM_0, \tag{38}
$$

where $\delta_0$ is given by equation (34), and again, we have not
written the $z$ dependence explicitly. The lower limit of the
integral has been set to $M_1$ since, in the spherical collapse
model which gives equation (34), the Eulerian radius of a
collapsed halo is zero. This means that if an Eulerian cell
$V$ contains an $(M_1, \delta_1)$-halo, then it must contain all of the
halo's mass, so it must have $M_0 > M_1$.

Equation (38) is interesting for the following reason.
The term on the left hand side, $f(M_1,\delta_1)$, is known. If the
Eulerian cell size $V$ is given, then $f(M_1,\delta_1|M_0,\delta_0)$ is also
known, for all $M_0$. Only the Eulerian probability distribution
$p(M_0|V)$ is not known. Therefore, equation (38) is an
integral equation of the first kind, so it can be solved numerically
to yield $p(M_0|V)\,dM_0$.

That is, for any Eulerian cell size $V$, the assumption (37)
allows one to solve for the Eulerian probability distribution
function that is associated with the spherical collapse model
as parametrized by equation (34). Once $p(M_0|V)$ is known,
repeated use of the assumption (37) allows one to compute

$$
\bar\xi_{hm}(M_1,\delta_1|V) = \langle \delta_h(1|0)\,\delta \rangle
 = \int \delta_h(1|0)\,\delta\;p(M_0|V)\,dM_0, \tag{39}
$$

where $(1 + \delta) = M_0/\bar\rho V$. Notice that this resembles the
Lagrangian relation (16). Similarly,

$$
1 + \bar\xi_{hh}(M_1,M_2,\delta_1|V) =
 \int_0^\infty \frac{c(M_1,M_2,\delta_1|M_0,\delta_0)}{n(M_1,\delta_1)V\;n(M_2,\delta_1)V}\,
 p(M_0|V)\,dM_0, \tag{40}
$$

where $c(12|0) = \mathcal{N}(1|0)\,\mathcal{N}(2|10)$ is the corresponding Lagrangian
relation, with $\delta_0$ given by equation (34).

Our approach extends that of Mo & White (1996). They
wrote down equations (39) and (40), though they did not
have an expression for $c(12|0)$. However, they did not write
down equation (38), so they did not know how to solve for
the Eulerian $p(M_0|V)$. Therefore, they assumed that they
could use the one measured in their simulations. Strictly
speaking, this is not permitted, since there is no guarantee
that equation (36) is then satisfied, as it should be. Indeed, if
one substitutes the Lognormal distribution for $p(M_0|V)$ (as
Mo & White did) into this formula, then one finds that,
in general, this normalization requirement is not satisfied
(though Mo & White do not mention this). Nevertheless, if
the spherical model is a good approximation to what actually
happens in the simulations, then there is some hope that
using the actual $p(M_0|V)$ distribution measured in the simulations
will, indeed, give the correct normalization. (Also
see Sheth 1998 for more discussion of this point.)

Below, when we compare our results with simulations,
we will show that the Mo & White approach is reasonably
well normalized on large scales. So, although we should first
determine the Eulerian $p(M_0|V)$ using the integral equation (38),
and then use it to compute the correlation functions
self-consistently, in what follows, we will not.

Mo & White mainly considered the case in which the
time at which the haloes first virialized, $a_1$, and that when
their spatial distribution was studied, $a_0$, were the same.
They also studied the spatial distribution of haloes at epochs
later than those at which the haloes had virialized ($a_0 > a_1$).
In both these cases, the previous formulae are correct if
$\delta_1 = 1.68647\,(a_0/a_1)$ and $\delta_0$ is given by equation (34) with
$z = 0$. Thus, $\delta_1 > \delta_0$ is always satisfied.

In principle, it should also be possible to use the spherical
model to describe the distribution of the haloes at high
redshift, prior to virialization. The spatial distribution at
some early time $z_1$ of haloes that will virialize at the present,
$z = 0$, is described by the previous expressions, but with
the appropriate value of $z = z_1$ in equation (34). This means
that we need to know the Eulerian distribution function as a
function of $z$. For example, for haloes that virialize at $z = 0$,
$\delta_1 = 1.68647$, so it is possible that $\delta_0 > \delta_1$. In the language
of the previous sections, such Eulerian cells are not isolated.
So, in principle, we need to be able to compute the probability
that an Eulerian cell is isolated. In general, this is
difficult. Fortunately, things simplify when $z \gg 1$: in this
limit the variance of the density field is small, most fluctuations
are small ($|\delta| \ll 1$), and the
Eulerian distribution function tends to a Gaussian. So, in
this limit, this procedure reduces to the Lagrangian description
of the previous sections. It is also reassuring that, in
this limit, the spherical model expressions reduce to those
expected using linear theory (Section 19 in Peebles 1980).
We will use this fact below.



Figure 4. The Lagrangian space probability distribution function
$p(\delta)$ as a function of overdensity $\delta$. Each panel shows four
choices of scale $R/L = 0.02$ (broadest curves), 0.04, 0.08 and 0.16
(narrowest curves). Histograms show the distribution measured
in the simulations; thin dashed curves show Gaussian distributions,
and thicker solid curves show Generalized Inverse Gaussian
fitting functions (equation 44) that have the same variance.


5 COMPARISON WITH SIMULATIONS 

This section shows the results of comparing the model predictions
obtained in the previous sections to the halo distributions
measured in numerical simulations of clustering.
This is done in two steps. First, the theoretical bias relation,
$\delta_h(1|0)$, and the scatter around this relation, are compared
with those found in the simulations. Then the theoretical
halo-mass and halo-halo correlation functions are
compared with those in the simulations, since these are essentially
weighted integrals over the bias relations. We do
this in Lagrangian space, and then in Eulerian space.

The simulations used here are the same as those used by
Mo & White (1996), where they are described in more detail.
They follow the evolution of $10^6$ identical particles in a cubic
box with periodic boundary conditions. If the volume $L^3$ of
the box, the mass $m$ per particle, and the initial expansion
factor $a$ are all set to unity, then the simulations are normalized
so that $S(M) = M^{-(n+3)/3}$ initially, where $n$ is the
initial slope of the power spectrum. The characteristic mass
$M_*(a)$ at the expansion time $a$ is given by $S(M_*) = (\delta_c/a)^2$,
for some $\delta_c$ which is determined by fitting the unconditional
mass function to the mass function of bound
objects identified in the simulations. The group identification
algorithm used here is the same friends-of-friends algorithm
used by Mo & White, as are the methods for assigning
Lagrangian and Eulerian positions to a group identified at
any given time. As for the simulations studied by Lacey &
Cole (1994), the mass function of bound objects in these
simulations is fit, to within a factor of two or so, by the
unconditional mass function with $\delta_c = 1.7$. This value is used to compute all the
theoretical curves shown below.
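The normalization just described fixes the characteristic mass at any output time. The sketch below, with hypothetical function names, assumes $S(M) = M^{-(n+3)/3}$ in units of the particle mass and $S(M_*) = (\delta_c/a)^2$, so that $M_* = (a/\delta_c)^{6/(n+3)}$.

```python
# Sketch: characteristic mass M_*(a) in particle-mass units, assuming the
# initial variance S(M) = M^(-(n+3)/3) and the condition S(M_*) = (delta_c/a)^2.

def variance(mass, n):
    """Initial variance S(M) for power-spectrum slope n, with mass in particle masses."""
    return mass ** (-(n + 3.0) / 3.0)


def m_star(a, n, delta_c=1.7):
    """Solve S(M_*) = (delta_c / a)^2 for M_*."""
    return (a / delta_c) ** (6.0 / (n + 3.0))


if __name__ == "__main__":
    print(m_star(6.1, 0.0))     # white noise, early output
    print(m_star(36.9, 0.0))    # white noise, late output
    print(m_star(6.07, -1.5))   # n = -1.5
```

With $\delta_c = 1.7$ this gives $M_* \approx 13$ and $M_* \approx 470$ particles for white noise at $a = 6.1$ and $a = 36.9$, and $M_* \approx 163$ for $n = -1.5$ at $a = 6.07$.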

The main complication in comparing the theory to simulations
is that of the finite mass resolution in the simulations.
This means that, in practice, correlations between
haloes are measured for a range of masses. This has an important
consequence, since now, the distribution of isolated
regions is different from that of the centre-of-mass distribution
of collapsed haloes (this is a subtle point that is
discussed more fully in Section A6). This is unfortunate,
since, to account for this fact, we must make some assumption
about the nature of the Lagrangian space volume elements
associated with halo centres-of-mass. In the Poisson
and white noise cases, Section A6 argued that we could simply
assume that this volume element is just a randomly chosen
one of the volume elements of a halo. This assumption
is almost certainly wrong if the initial distribution differs
from white noise. Nevertheless, for reasons discussed in Section A6,
we will assume that this is, indeed, the case.


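This means that the mean bias relation is

$$
\delta_h(>m|0) = \frac{\mathcal{N}(>m,\delta_1|M_0,\delta_0)}{n(>m,\delta_1)\,V_0} - 1, \tag{41}
$$

where

$$
n(>m,\delta_1) = \int_m^\infty n(M_1,\delta_1)\,dM_1,
$$

and

$$
\mathcal{N}(>m,\delta_1|M_0,\delta_0) = \int_m^\infty \mathcal{N}(M_1,\delta_1|M_0,\delta_0)\,dM_1,
$$

provided $M_1 < M_0$ and $\delta_1 > \delta_0$. In the Eulerian space, $M_0$
and $\delta_0$ are obtained from $V$ and $\delta$ as described in Section 4.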

This quantity depends only on the first moment of the
subclump distribution. Although it could have been computed
by Mo & White (1996), they did not show it. The
scatter in this relation depends on the second order moment,
so, although they were unable to compute it, we can.

There are additional reasons why it is not entirely
straightforward to compare the theory with simulations. For
example, the average number density of haloes (the unconditional
mass function), and the average number of subhaloes
within haloes (the conditional mass function) in the simulations
are, typically, described by the theory only to within
a factor of two or so. Also, on small scales in particular,
the initial particle distribution in the simulations is not particularly
Gaussian when the initial power on large scales is
significant (see Fig. 4). Since the bias relations are essentially
the ratio of the conditional to the unconditional mass functions,
they are sensitive to the first of these discrepancies.
The integrals which define $\bar\xi_{hm}$ and $\bar\xi_{hh}$ are also sensitive to
the shape of the initial probability distribution function, so
they are sensitive to both these discrepancies.

Finally, there is some uncertainty regarding how the initial
particle load in the simulations should be treated. This
freedom arises because the initial particle distribution is not
the true Lagrangian distribution, but a linearly evolved version
of it. This means that, when comparing the Lagrangian
theory with the simulations, we must account for the fact
that cell sizes in the initial distribution are not the same as
the associated Lagrangian size. Though they do not say so
in their paper, Mo & White (1996) treated this problem as
follows (private communication). They used equation (32)
to rescale the size of each cell in the simulations, and then
used this rescaled size in the denominator that defines $\delta_h$,
but nowhere else. They then used this value of $\delta_h$ when
averaging over all cells to determine what they called the
Lagrangian correlation function.

We have chosen the following procedure. We treat the
initial particle distribution no differently from any other output
time in the simulations. This means that we plot the
simulation results exactly as measured, with no rescaling.
We then compare these to our theoretical Eulerian expressions,
transformed according to the spherical model to the
appropriate redshift. Recall that, in the limit of small initial
fluctuations, this is the same as using linear theory to
make the necessary corrections (Section 19 of Peebles 1980).
The complication is that, in this case, the associated $p(\delta)$
distribution is no longer Gaussian, so the distribution corresponding
to $q$ is no longer known. Nevertheless, if $p(\delta)$ is
sufficiently close to Gaussian, then using $q$ should be a good
approximation. We find that the Generalized Inverse Gaussian
distributions (described below) provide
reasonable fits to the counts in cells distributions measured
in the simulations for a wide range of scales and output
times, so we use them for $p(\delta)$.


5.1 Biasing in Lagrangian space 

This subsection compares the bias relation between haloes 
and mass measured in the simulations in the Lagrangian 
space with the theoretical model developed in the previous 
sections. 

Figure 5. The Lagrangian space bias relation for haloes which
contain more than $m = 32$ particles that form from white noise
initial conditions. Plot shows the mean overdensity of haloes
$\delta_h(>m|V)$ as a function of the overdensity of mass $\delta_m$ in spherical
cells of radius $R$, as well as the scatter around the mean.
Symbols show quantities measured in the simulations: large filled
circles show the mean, smaller filled circles show the rms scatter,
and open circles show the scatter if the halo counts were
Poisson. Solid curves show the model predictions, dashed curves
show the Poisson scatter corresponding to the theoretical mean.
Haloes were identified at an expansion factor of $a = 6.1$; the bias
relation was computed from the halo-centre-of-mass and mass distributions
at the initial time $a = 1$. The histograms that rise from
left to right in each panel show the cumulative counts-in-cells distribution.
The simulations provide a good test of the theory only
in the range where this cumulative curve is steep.

Figs. 5 to 9 show the bias relation for haloes containing
more than $m$ particles, identified in simulations with initial
power spectra having slope $n$ at an expansion factor $a$ since
the initial time, and for four representative choices of the
spherical cell radius: $R/L = 0.02$, 0.04, 0.08, and 0.16. For
each cell size, statistics were averaged over 27,000 spherical
cells. The histogram which rises from the bottom left
to the top right of each panel shows the cumulative distribution
function of the matter fluctuation $\delta_m$. This curve is
intended to show the range of $\delta_m$ over which the simulations
are able to provide a good test of the theory. The thin
dashed line through each histogram shows the corresponding
cumulative distribution for a Gaussian with the same
variance; the thin solid line through each histogram shows
the corresponding Generalized Inverse Gaussian. The large
filled circles show the mean bias relation measured in the
simulations, smaller filled circles show the rms fluctuations
around this mean, and the open circles show the expected
Poisson fluctuation given the mean. In most cases, the rms
fluctuations are smaller than the Poisson value; this shows
that volume exclusion effects are important. The thickest
solid curve shows the mean bias relation predicted by the
model, the less thick solid curves show the theoretical rms
fluctuation around this mean, and the dashed curves show
the value if the fluctuations were Poisson.
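The comparison between the measured rms scatter and the Poisson expectation can be sketched as follows. This is an illustrative estimator, not the analysis code used for the figures: the function names, the binning, and the synthetic Poisson sample in the demo are all assumptions. For Poisson counts with mean $\langle N\rangle$ in a bin, the variance of $\delta_h = N/(\bar n V) - 1$ is $\langle N\rangle/(\bar n V)^2 = (1 + \langle\delta_h\rangle)/(\bar n V)$.

```python
import math
import random


def poisson_sample(mean, rng):
    """Draw one Poisson variate by Knuth's multiplication method (suitable for modest means)."""
    limit = math.exp(-mean)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1


def bias_relation(halo_counts, delta_m, mean_count, bin_edges):
    """Bin cells by delta_m; return (bin centre, mean delta_h, rms scatter, Poisson scatter).

    mean_count is n_bar * V, the average number of haloes per cell.
    """
    rows = []
    for lo, hi in zip(bin_edges[:-1], bin_edges[1:]):
        sample = [n / mean_count - 1.0
                  for n, d in zip(halo_counts, delta_m) if lo <= d < hi]
        if len(sample) < 2:
            continue  # too few cells to estimate a scatter
        mean = sum(sample) / len(sample)
        var = sum((x - mean) ** 2 for x in sample) / len(sample)
        # Poisson counts: var(N) = <N>, so var(delta_h) = (1 + <delta_h>) / mean_count
        poisson = math.sqrt(max(1.0 + mean, 0.0) / mean_count)
        rows.append((0.5 * (lo + hi), mean, math.sqrt(var), poisson))
    return rows


if __name__ == "__main__":
    rng = random.Random(0)
    mean_count = 20.0
    delta_m = [rng.uniform(-0.5, 0.5) for _ in range(5000)]
    # Synthetic, unbiased Poisson haloes: <N | delta_m> = n_bar V (1 + delta_m)
    counts = [poisson_sample(mean_count * (1.0 + d), rng) for d in delta_m]
    edges = [-0.5 + 0.1 * i for i in range(11)]
    for row in bias_relation(counts, delta_m, mean_count, edges):
        print("%.2f  mean=%.3f  rms=%.3f  poisson=%.3f" % row)
```

For the synthetic Poisson sample the two scatter estimates agree; an rms scatter below the Poisson value, as found for the simulated haloes, is the signature of volume exclusion.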

Fig. 5 is extremely encouraging. The theory is able to
describe the mean bias relation, $\langle\delta_h|\delta_m\rangle$, as well as the scatter
in this relation well, even when the scatter is less than
Poisson (though, for haloes of this mass range, the difference
from Poisson scatter is small). That is, the theory appears
to describe the effects of volume exclusion on the halo distribution
well. Figs. 6 and 7 are intended to show that the
theory must be used with some caution. These figures show
the bias relation associated with haloes identified at a later
time than those in Fig. 5. Since $M_* = (a/\delta_c)^2$ for white
noise, $M_* \approx 13$ for the haloes in Fig. 5, whereas $M_* \approx 470$
for the haloes in Figs. 6 and 7. At the later output time, the
theory gets the mean of the Lagrangian bias relation wrong,
although the scatter around the mean is still qualitatively
correct, when the minimum mass $m = 32$. For haloes more
massive than $m = 256$, however, the theory is accurate, in
the mean, and for the scatter.

Figure 6. The same as the previous Figure, i.e., $n = 0$ and
$m = 32$, but now $a = 36.9$.

Figure 7. The same as the previous Figure, i.e., $n = 0$ and
$a = 36.9$, but now $m = 256$.

Thus, these figures show that the theory is relatively
accurate when describing the distribution of haloes more
massive than $\sim M_*$, but not of less massive haloes. This
suggests that the spherical model is a good description of
the collapse of massive haloes, but that the formation and
evolution of less massive haloes may be more complicated.

Figs. 8 and 9 show that the theory works even when the
initial conditions are different from white noise. These figures
were constructed from haloes identified at an expansion
factor $a = 6.1$ in a simulation in which the initial power spectrum
had slope $n = -1.5$. So, for these figures, $M_* \approx 163$.
Again, the bias relation associated with massive haloes is
well described by the theory (Fig. 9), whereas that of the
less massive haloes is not (Fig. 8).

Figure 8. The same as the previous Figure, but for $n = -1.5$,
$m = 32$ and $a = 6.07$.

Figure 9. The same as the previous Figure, i.e., $n = -1.5$ and
$a = 6.07$, but now $m = 256$.

Before concluding this subsection, it is worth noting
that the theoretical curves for the mean bias relation become
increasingly different from the simulation results as
$R$ decreases. Although the mean relation on these smaller
scales is different, the predicted scatter around the mean
shows the same qualitative behaviour as that measured. We
have not shown curves for smaller $R$ here, since on these
smaller scales it is not clear how much of the discrepancy
in the mean is due to limitations associated with the finite
number of particles in the numerical simulations.

Figure 10. The volume average of the Lagrangian space halo-mass
cross correlation function, $\bar\xi_{hm}(>m|V)$, equation (42), as a
function of cell size $R$, when the initial power spectrum has slope
$n$, for haloes identified at a range of output times, labelled by the
expansion factor $a$. Panels on the left show the result of computing
the average by using only those cells whose initial density was
less than $\delta_c/a$. Panels on the right show the result of averaging
over all Lagrangian cells, whatever their density. Symbols show
this quantity measured in the simulations; curves show the model
predictions (made using the value of $\delta_c$ shown). From bottom to
top in each panel, the different curves are for haloes with $m = 32$
(circles), 64 (triangles), 128 (squares), and 256 (stars) particles.


5.2 Lagrangian space halo correlation functions 


The cross correlation between haloes and mass is essentially
a weighted integral over the bias relations shown in the previous
subsection. In this sense, $\bar\xi_{hm}$ is a slightly less fundamental
quantity than $\langle\delta_h|\delta_m\rangle$. The cross correlation between
haloes with mass larger than $m$, whose centres-of-mass are
within a cell $V_0$, and the mass within that cell, is

$$
\bar\xi_{hm}(>m|0) =
 \int_m^\infty \frac{n(M_1,\delta_1)}{n(>m,\delta_1)}\,\bar\xi_{hm}(1|0)\,dM_1
 \;+\; \delta_1 \int_\mu^\infty \frac{n(M_1,\delta_1)}{n(>m,\delta_1)}\,dM_1, \tag{42}
$$

where $n(M_1,\delta_1)$ and $\bar\xi_{hm}(1|0)$ were defined earlier, $\mu =
\max(m, M_0)$, and the convention is that, in the first term,
$\bar\xi_{hm}(1|0) = 0$ if $M_1 > M_0$. The second term accounts for
the difference between counting haloes instead of isolated
regions. This is the analogue of equation (A.67). In general,
these integrals over the range of halo masses must be done
numerically.
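A minimal sketch of such a numerical integration over halo mass is given below. It assumes the white-noise Press-Schechter form of the unconditional mass function, $n(M,\delta_1)\,dM = \bar\rho\,(\delta_1/\sqrt{2\pi})\,M^{-3/2}\exp(-\delta_1^2 M/2)\,dM$ with $S = 1/M$ in particle-mass units; this form, the integration limits, and the function names are assumptions made for illustration.

```python
import math

# Sketch: integrals over halo mass for the cumulative quantities, assuming the
# white-noise Press-Schechter mass function (S = 1/M, masses in particle units).

def n_white_noise(M, delta1, rho_bar=1.0):
    """Assumed unconditional number density of haloes per unit mass."""
    return (rho_bar * delta1 / math.sqrt(2.0 * math.pi)
            * M ** -1.5 * math.exp(-0.5 * delta1**2 * M))


def integrate_log(func, lo, hi, n_steps=4000):
    """Trapezoidal rule in ln M: int func(M) dM = int func(M) M dlnM."""
    a, b = math.log(lo), math.log(hi)
    h = (b - a) / n_steps
    total = 0.0
    for i in range(n_steps + 1):
        weight = 0.5 if i in (0, n_steps) else 1.0
        M = math.exp(a + i * h)
        total += weight * func(M) * M * h
    return total


def n_above(m, delta1, rho_bar=1.0):
    """Cumulative number density n(>m, delta1); the cutoff sits far in the exponential tail."""
    upper = m + 100.0 / delta1**2
    return integrate_log(lambda M: n_white_noise(M, delta1, rho_bar), m, upper)


if __name__ == "__main__":
    delta1 = 1.7 / 6.1  # delta_c / a for haloes identified at a = 6.1
    for m in (32, 64, 128, 256):
        print(m, n_above(m, delta1))
```

Two checks are available in this white-noise case: the mass-weighted integral $\int M\,n\,dM$ should return $\bar\rho$, and $n(>m)$ has the closed form $\bar\rho\,(\delta_1/\sqrt{2\pi})\,[2e^{-am}/\sqrt{m} - 2\sqrt{\pi a}\,\mathrm{erfc}(\sqrt{am})]$ with $a = \delta_1^2/2$.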

Figure 10 shows equation (42) for white noise initial
conditions, for haloes identified at a range of output times,
and minimum mass cutoffs, as a function of scale. The plots
are for the Lagrangian space distribution of haloes identified
at the epoch $a$, and the four curves in each plot are
(from bottom to top) for $m = 32$, 64, 128, and 256 particles,
respectively. The figure actually shows $\bar\xi_{hm}/\bar\xi_m$, where
$\bar\xi_m = a^2/M^{\alpha}$, with $\delta_c = 1.7$ as required
by the spherical model. The two panels show the difference
between averaging over all Lagrangian cells (right) and averaging
only over those Lagrangian cells which are not too
overdense (left). Thus, the panels on the left are the same
quantity computed by Mo & White (1996).

Figure 11. The same as the previous figure, but now $n = -1.5$.
The theory describes the simulation results reasonably well for
massive haloes, and rather poorly for less massive haloes, where
massive and less massive are defined relative to $M_*(a)$.

Typically, the fits in the panels on the left are better
than those shown in the panels on the right, and
the fit is usually better on larger than on smaller scales. (On
large scales, the number of cells in the two panels is almost
the same anyway.) This suggests that the way in which the
model assigns haloes to Lagrangian cells that are not isolated
is not quite correct. In the panels on the right, the
model systematically underestimates $\bar\xi_{hm}(>m|V)$ on small
scales. Comparison with Fig. 11 shows that the discrepancy
increases as the initial power on large scales increases ($n$
becomes more negative). This is not unexpected. The assumption
that the centre-of-mass particle is a random one
of a halo's particles is likely to be less accurate as $n$ becomes
more negative. On the other hand, some of the discrepancy
on small scales may be spurious. These are measurements in
Lagrangian space, and the initial inter-particle spacing was
on the order of $R/L \sim 0.01$, so it is not clear that differences
on these small scales are significant. Moreover, recall
that when $n = -1.5$, the initial particle distribution on
small scales is far from Gaussian (Fig. 4).

Figs. 10 and 11 appear to show that the theory describes
the simulation results better for small values of the expansion
factor $a$. This is a consequence of one of the results of
the previous subsection; when the mass of a halo identified
at time $a$ is expressed in units of $M_*(a)$, then the theory
describes the distribution of massive haloes better than less
massive ones. At some small $a$, haloes with more than, say,
64 particles are larger relative to an $M_*$ halo at that time,
than they are at some later time. So, in Figs. 10 and 11, the
theory appears to work better at small $a$ than large.

Before considering the halo-halo correlation function
we think it worth remarking that some of the agreement
between theory and simulation is a consequence of showing
the ratio $\bar\xi_{hm}/\bar\xi_m$ rather than $\bar\xi_{hm}$ and $\bar\xi_m$ themselves. On
small scales $\bar\xi_m \gg 1$, so the ratio tends to zero. Had we
shown $\bar\xi_{hm}$ only, then the theory and the simulation curves
can look quite different, particularly on small scales. Again,
this suggests that the theory should be used with caution.

Figure 12. The volume average of the Lagrangian space halo-halo
correlation function, $\bar\xi_{hh}(>m|V)$, equation (43), as a function
of cell size $R$, for the same haloes that were used to make
Fig. 10. Panels on the left show the result of computing the average
by using only those cells whose initial density was less than
$\delta_c/a$. Panels on the right show the result of averaging over all
Lagrangian cells, whatever their density. Symbols show this quantity
measured in the simulations; curves show the model predictions
with the value of $\delta_c$ shown.

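The correlation function between haloes with mass
greater than $m$, averaged over Lagrangian cells of size $V_0$, is

$$
1 + \bar\xi_{hh}(>m|0) =
 \int_m^\infty dM_1 \int_m^\infty dM_2\;
 \frac{n(M_1,\delta_1)\,n(M_2,\delta_1)}{n^2(>m,\delta_1)}
 \times \left[1 + \bar\xi_{hh}(12|0)\right], \tag{43}
$$

where $\bar\xi_{hh}(12|0)$ was given earlier and the convention
is that $\bar\xi_{hh}(12|0) = -1$ if $M_1 + M_2 > M_0$. This is the
analogue of equation (A72).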

Figs. 1 to 3 show that, as a result of volume exclusion
effects, $\bar\xi_{hh}(>m|0)$ is likely to be negative for all except large
values of $V_0$. Since halo correlations increase as $n$ decreases,
this effect will be weaker as $n$ becomes more negative. Thus,
when $n \sim 0$, then $\bar\xi_{hh}(>m)$ will almost always be negative.
Only when $n \sim -1$ or so will it become positive, and then,
only when $m$ is large compared to $M_*(z)$. The distribution
of haloes measured in the simulations shows that this is true.

Fig. 12 shows equation (43), for a range of output times
and minimum mass cutoffs, as a function of scale. The plots
are for the Lagrangian space distribution of the same haloes
that were used to produce Fig. 10. Notice that more massive
haloes are always less clustered than less massive haloes, in
agreement with the white-noise result. This
would not have been expected from the Mo & White (1996)
formulae. Again, this suggests that our model for halo exclusion
effects is reasonably accurate. Figure 13 shows that our
model is also reasonably accurate when the initial conditions
differ from white noise.

There are, of course, some systematic differences. The theoretical
curves fit the data in the panels on the left better than the data
shown on the right, and the discrepancy is more obvious for n = -1.5
than for n = 0. This simply reflects the fact that our model, in
which the centre-of-mass particle of a halo is a random one of its
constituent particles, is not very realistic (though it is a better
approximation in the white noise case). Also, on small scales, the
simulation haloes are systematically less anti-correlated than the
model predictions, suggesting that they are affected less strongly by
volume exclusion effects than in the model. This is a consequence of
at least two facts. The first is that, in the simulations, small
haloes in particular are not necessarily spherical, so the excluded
volume associated with them is not necessarily spherical. Thus, in
the simulations, it is possible for two centre-of-mass particles,
associated with haloes of mass $M_1$ and $M_2$, to fall in the same
spherical Lagrangian region $M_0$, even if $M_1 + M_2 > M_0$, since
not all their associated particles actually fall in $M_0$. In the
model this never happens. The second is that, in fact, the number
density of haloes described by the model (the denominator in
equation 43) is, in general, only within a factor of two or so of the
actual number density of haloes measured in the simulations. Since
the halo-halo correlation function is normalized by the square of
this number density, this relatively minor discrepancy may still be
important. Finally, recall that when n = -1.5, the initial
distribution on small scales was not particularly Gaussian (as shown
in an earlier figure).


5.3 The Eulerian probability distribution function 

Figure 14. The Eulerian space probability distribution function
$p(\delta)$ as a function of overdensity $\delta$, for clustering
from white noise initial conditions. Each panel shows four choices of
scale R/L = 0.02 (broadest curves), 0.04, 0.08 and 0.16 (narrowest
curves). Histograms show the distribution measured in the
simulations; thicker, smoother curves show Generalized Inverse
Gaussian distributions that have the same variance.

We argued earlier that, in principle, the Mo & White model for
transforming Lagrangian space statistics into Eulerian space ones can
be used to derive the Eulerian space dark matter distribution
function. To do so, we showed that one must solve an integral
equation. However, not only must the resulting distribution be
correctly normalized (to unity), but $\langle\Delta\rangle = 1$ as
well. There is no guarantee that, in general, the solution to the
integral equation will meet both normalization conditions. Therefore,
we have chosen to stick with the approach used by Mo & White (1996).
Namely, when the Eulerian distribution function is required, we will
simply use the one measured in the simulations, since it is
guaranteed to satisfy these normalization conditions. Whenever we do
so, we will also show the extent to which this is self-consistent by
showing the ratio of the left hand side to the right hand side of the
integral equation.
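
The two normalization conditions discussed in this subsection are easy to check numerically for any tabulated distribution. The sketch below is only an illustration: it assumes the second condition means that the mean of $1 + \delta$ is unity, uses a simple trapezoidal rule, and is applied to a toy uniform distribution rather than to any distribution from the paper. The function name `pdf_moments` is hypothetical.

```python
def pdf_moments(delta, p):
    """Trapezoidal estimates of the integral of p and of the mean of 1 + delta.

    delta : increasing list of overdensity values
    p     : probability density tabulated at those values
    """
    norm = 0.0
    mean = 0.0
    for i in range(len(delta) - 1):
        width = delta[i + 1] - delta[i]
        norm += 0.5 * width * (p[i] + p[i + 1])
        mean += 0.5 * width * ((1.0 + delta[i]) * p[i]
                               + (1.0 + delta[i + 1]) * p[i + 1])
    return norm, mean


# Toy check: a uniform density on -1 <= delta <= 1 meets both conditions.
toy_delta = [-1.0 + 0.01 * i for i in range(201)]
toy_p = [0.5] * len(toy_delta)
toy_norm, toy_mean = pdf_moments(toy_delta, toy_p)
```

A candidate solution of the integral equation could be passed through the same function; if either returned value differs appreciably from unity, the solution fails one of the two conditions.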

Figs. 14 and 15 show the Eulerian space probability distribution
function for a range of cell sizes. The histograms show the
$p(\delta)$ distribution measured in the simulations. Solid curves
show Generalized Inverse Gaussian distributions (e.g. Sheth 1998)
that have the same variance:


p{d) d(5 


2Kx{co) 




(44) 


where $s = 1/(1 + \delta)$, $K_\lambda(\omega)$ is a modified Bessel
function of the third kind, and $\lambda = -3/[2(n + 3)]$, if the
initial power spectrum had slope n. The parameter $\omega$ is related
to the variance by the relation

{{l + 5f) = l + ^rn^K3^{oj)/Kx{uj) (45) 


(since $\langle\delta\rangle = 0$, and $K_\lambda = K_{-\lambda}$).
For the curves shown, the values of the variance are as follows: when
n = 0 and a = 6.1, the variance is 0.62, 0.1, 0.01, and 0.002 for
R/L = 0.02, 0.04, 0.08 and 0.16, respectively. When n = 0 and a = 37,
the corresponding values have grown to 10.9, 2.1, 0.4 and 0.07. When
n = -1.5 and a = 6.07, the values are 14, 3.6, 0.98, and 0.26. Thus,
on small scales, the clustering is reasonably well evolved. The
figures show that the analytic formulae provide a reasonably good,
but by no means perfect, fit to the simulation data on all scales.
The fit appears better on a log scale than on a linear scale.
Nevertheless, they will be used as convenient fitting functions to
the Eulerian space distributions in Section 5.6.


5.4 Biasing in Eulerian space 

This subsection compares the bias relation between haloes and mass
measured in the simulations in the Eulerian space with the
theoretical model developed in the previous sections. The theoretical
model combines the Lagrangian expressions derived earlier with the
Mo & White (1996) model of Eulerian evolution.




Figure 15. Same as the previous figure, but for clustering from
n = -1.5 initial conditions.

Figure 16. The Eulerian space bias relation for haloes containing
more than m = 32 particles that form from white noise initial 
conditions. Symbols show quantities measured in the simulations: 
large filled circles show the mean, smaller filled circles show the 
rms scatter, and open circles show the scatter if the halo counts 
were Poisson. Curves show the model predictions. Haloes were 
identified at an expansion factor of a = 6.1; the bias relation was 
computed from the halo-centre-of-mass and mass distributions at 
that time. The histograms that rise from left to right in each panel 
show the cumulative counts-in-cells distribution. The simulations 
provide a good test of the theory only in the range where these 
curves are steep. The solid lines through the histograms show 
the cumulative Generalized Inverse Gaussian distribution fitting 
functions. 


However, it is independent of the Eulerian space dark matter 
distribution function. 

Figs. 16-18 show the bias relation for the same haloes
as in previous figures, but now the mean and the scatter are 
measured in Eulerian space. The histograms show the cu¬ 
mulative Eulerian space distribution function, and the solid 
lines through the histograms show the cumulative General¬ 
ized Inverse Gaussians that have the same variance. As in 
the Lagrangian case, these cumulative curves are included

Figure 17. The same as the previous figure, i.e., n = 0 and
m = 32, but here a = 36.9.

to show the range over which the simulations provide a good 
test of the theory; this range is where the cumulative curves 
are steep. The figures show that the theoretical curves for 
the mean Eulerian bias fit the corresponding quantities mea¬ 
sured in the simulations very well. This agreement has al¬ 
ready been shown by Mo & White (1996). What is new here 
is that our expressions for the scatter around the mean bias 
relation appear to describe that measured in the simulations 
very well also. The agreement at small R is particularly 
gratifying, since there the scatter is significantly less than 
Poisson. This shows that our model is able to account cor¬ 
rectly for volume exclusion effects. The agreement between 
theory and simulation when n = -1.5 is also encouraging.
It suggests that our simple analytic model for quantifying 
the effects of volume exclusion is reasonably accurate even 
when the initial conditions are significantly different from 
white noise. 

5.5 Dependence of the mass function on local 
overdensity 

There is another way to show that the Mo & White Eulerian space bias
model is reasonably accurate. An earlier equation shows that the
unconditional, universal mass function $n(M)$ is simply related to
the conditional mass function $N(M|\delta)$ of haloes that are known
to be in Eulerian cells V which have overdensity $\delta$, averaged
over all values of $\delta$. In the Mo & White model,
$N(M|\delta)$ is, in general, different from
$(1 + \delta)\,n(M)\,V$. In particular, in the model, the shape of
the mass function depends on the Eulerian overdensity: the ratio of
massive haloes to less massive haloes is larger in dense regions than
in less dense regions. Figs. 19 and 20 show that this is consistent
with what is measured in the simulations.

These figures are similar to Fig. 1 of Lemson & Kauffmann (1999).
They show the conditional mass function $N(M|\delta)$ for haloes in
Eulerian cells V that have overdensity $\delta$, for a range of
choices of $\delta$ and V. The top left panel

Figure 18. The same as the previous figure, but for n = -1.5,
m = 32, and a = 6.07.

shows the range $-0.8 < \delta < -0.4$, the top right shows
$-0.5 < \delta < -0.1$, bottom left shows $0.3 < \delta < 0.7$ and
bottom right shows $1.2 < \delta < 1.8$. The three sets of curves in
each panel show different cell sizes: R/L = 0.04 (bottom), 0.08 and
0.16 (top). The histograms show $(1 + \delta)$ times the largest cell
volume V times the unconditional mass function measured in the
simulations. The associated dashed curves show $(1 + \delta)V$ times
the Press-Schechter formula for the universal unconditional mass
function with $\delta_c = 1.7$. The dashed curves provide good but
not perfect fits to the histograms. Changing the cell size on a
log-log plot simply changes the amplitude of the curves, so for
smaller cell sizes we only show the analytical formula. The solid
symbols show the actual conditional mass function measured in the
simulations and the bold curves show the conditional mass function of
the Mo & White model. The symbols differ from the histograms in the
same way that the solid curves differ from the dashed curves. (The
bottom right panel has only two sets of symbols because there were no
large cells with the given range in $\delta$.) This shows explicitly
that, just as the Press-Schechter formula provides a reasonable fit
to the unconditional mass function averaged over all Eulerian cells,
the Mo & White model provides a reasonable fit to the mass function
if only cells of a certain density range are used when computing the
average.

The data points show the mean number of haloes in
cells V that are known to have overdensity $\delta$. Since not all
cells have the same number of haloes, there is some scatter 
around this mean. Our extension of the Mo & White model 
allows us to predict the rms ‘error bars’ on the data points. 
We have not shown them here. 


5.6 Eulerian space halo correlation functions 

This subsection compares the Eulerian space halo-mass and 
halo-halo correlations measured in the simulations with the 
theoretical model developed in the previous sections. To do 
this requires knowledge of the distribution function of the

Figure 19. The mass function of haloes that are in Eulerian cells V
which have overdensity $\delta$. Haloes were identified at an
expansion factor of a = 36.9 in the simulations with white noise
initial conditions. The three sets of curves in each panel show
results for three cell sizes: R/L = 0.04 (bottom), 0.08 and 0.16
(top). Filled symbols show the average number of haloes in those
Eulerian cells that have overdensity $\delta$. The histogram shows
$(1 + \delta)$ times the universal mass function times the largest
cell size. On a log-log plot, it has the same shape but a different
amplitude for the other cell sizes. The dashed curves show the
corresponding theoretical curves: $(1 + \delta)V$ times the universal
mass function. The solid curves show the mass function computed using
the Mo & White bias model.


probability that a randomly placed Eulerian cell of size V 
contains mass M. Although an earlier section discussed how the Mo
& White approach can be extended to derive this distri¬ 
bution self-consistently, here we simply follow the approach 
used by Mo & White. Namely, we will use the Eulerian prob¬ 
ability distribution functions measured in the simulations 
themselves (and, in fact, we will use the Generalized Inverse 
Gaussian fits to these distributions), rather than the ones re¬ 
quired by self-consistency. Recall that this means that there 
is no longer any guarantee that the model gives the correct 
number density of haloes. Below, we will show explicitly that 
the model is not self-consistent on small scales. 

Figs. 21-23 show the result of comparing the Mo & White model with
the Eulerian space distributions measured in the simulations. The top
panels in each figure show $N(>m|V)/n(>m)V$, the middle panels show
$\bar\xi_{hm}(>m|V)/\bar\xi_m(V)$, and the bottom panels show
$\bar\xi_{hh}(>m|V)/\bar\xi_m(V)$ as a function of Eulerian scale.
The symbols show the quantities measured in the simulations, and are
coded similarly to those in the corresponding Lagrangian space plots.
The solid curves show the theoretical quantities.

If the Mo & White model were self-consistent, then the theoretical
curves in the top panels of each figure would be unity on all scales.
Thus, the figures show that the Mo & White model is inconsistent on
small scales. The middle panels show that, despite this
inconsistency, the model provides a good fit to the Eulerian space
cross correlation

Figure 20. Same as the previous figure, but for n = -1.5 initial
conditions, and an expansion factor of 6.1.

between haloes and mass. This is primarily a consequence of the fact
that the mean Eulerian bias is well reproduced by the Mo & White
model (Figs. 16-18). These curves are similar to those shown in
Fig. 4 of Mo & White (1996). The bottom panels should be compared
with Fig. 5 of Mo & White (1996). Whereas their model curves increase
as R/L decreases, ours do not. Thus, our model for the volume
averaged halo-halo correlation function works significantly better
than the one they used. This is to be expected, since our model
explicitly takes account of volume exclusion effects, whereas theirs
did not. The bottom panels also show that, on sufficiently large
scales, one consequence of dynamical evolution is to make massive
haloes more strongly clustered than less massive ones. This is in
agreement with earlier predictions (Cole & Kaiser 1989; Mo & White
1996) as well as with the model developed here.


6 DISCUSSION 

Numerical simulations show that haloes are biased tracers 
of the matter distribution. This bias depends nonlinearly 
on scale and on halo mass, and the bias on any given scale 
is stochastic. This paper describes an analytic model which 
describes this nonlinear, stochastic biasing, as well as its 
evolution, reasonably accurately. 

The model is consistent with the assumption that disconnected volumes
in the initial Lagrangian space may be treated as being mutually
independent. This assumption allows one to use quantities associated
with the merger histories of dark haloes to estimate the Lagrangian
space correlation functions of these haloes. The assumption of
independence is most likely to be accurate if the initial
distribution was Poisson or Gaussian white noise. The Poisson model
is described in detail in Appendix A, where various subtle issues
involved in this approach are discussed rigorously. In the limit of
small fluctuations and large numbers of particles, statements about
clustering from Poisson initial conditions are easily related to
those that describe clustering from white noise initial conditions
(Sheth 1995, 1996).

Figure 21. Various Eulerian space quantities as a function of
Eulerian cell size. Top panel shows $N(>m|V)/n(>m)V$, middle panel
shows $\bar\xi_{hm}(>m|V)/\bar\xi_m$, and bottom panel shows
$\bar\xi_{hh}(>m|V)/\bar\xi_m$. Filled circles, triangles, squares
and stars show results for haloes in the simulations that contain
more than 32, 64, 128, and 256 particles, respectively. Solid curves
show the model predictions.

Figure 22. Same as the previous figure, but for haloes identified
at a later output time.

An earlier section derived expressions for the mean and higher order
moments of the halo distribution, for white noise initial conditions.
The final expressions complement and extend those derived by Mo &
White (1996). In particular, the results of that section allow one to
account for
volume exclusion effects which arise from the fact that haloes 
initially occupy a volume that is proportional to their mass. 
These effects were described, but not quantified by Mo & 
White. Our results also include the effects of the scatter 
among different formation histories of individual regions in 
the initial conditions on the statistics of the halo distribu¬ 
tion in space—another effect that was described, but not 
quantified, by Mo & White. 

Whereas disconnected volumes are mutually independent in the white
noise case, this is not true for more general Gaussian initial
conditions. However, Sheth & Lemson

Figure 23. Same as the previous figure, but for initial conditions
with a power spectrum with slope n = -1.5.

(1998) showed that it is possible to provide a good approx¬ 
imate description of the forest of merger history trees asso¬ 
ciated with haloes which form from initial conditions with 
large scale correlations by simply ignoring these correlations. 
In the Mo & White model, knowledge of the merger history 
trees is equivalent to knowledge of the spatial distribution 
of dark haloes. An earlier section used this fact to argue that the
white noise results could be used to provide simple ana¬ 
lytic approximations for the higher order moments of the 
Lagrangian space halo distribution even when the initial 
power on large scales is substantial. The Sheth & Lemson 
merger tree results suggest that these analytic approxima¬ 
tions should also be reasonably accurate. 

As a result of dynamical evolution, the evolved halo 
distribution is different from that in the initial Lagrangian 
space. To describe the evolved distribution we used the 
spherical model, in the way suggested by Mo & White, to 
relate the initial halo distribution described above to the fi¬ 
nal evolved one. We showed that in addition to allowing one 
to estimate the evolved halo-mass and halo-halo correlation 
functions, the Mo & White model could have been used to 
compute the Eulerian space probability distribution func¬ 
tion of the dark matter itself. This is a potentially useful 
extension of their model. 

Once the model had been fully specified, we compared 
it with numerical simulations of hierarchical gravitational 
clustering. Comparison with the halo distribution in the
simulations (Section 5) showed that while the Mo & White bias
model is reasonably accurate when describing the mean La¬ 
grangian space bias relation of massive haloes, it predicts the 
wrong mean value for less massive objects. Our extension of 
the Mo-White model allows us to compute the higher or¬ 
der moments of the bias relation. For massive objects (those 
for which the Mo-White mean is accurate), it describes the 
scatter around the mean well. For less massive objects, when 
the Mo-White model gets the mean value wrong, our model 
for the scatter around the mean is still in qualitative agree¬ 
ment with the simulations. 

Results for the halo distribution in Eulerian space were more
encouraging. The Mo & White model describes the mean properties of
the bias relation in Eulerian space well, for a larger range of
masses than in the Lagrangian space, and our extension of the model
is able to describe the scatter around this mean well. Our model
works even on scales where volume exclusion effects are important.
This is very encouraging, since our model provides simple, analytic
expressions for these higher order moments. Although our simulations
do not have the dynamic range to investigate a large mass range,
those of Jing (1998) do. Jing finds that on large scales, where the
bias is constant, the Eulerian space low-mass halo distribution is
more clustered than the Mo & White model predicts. In other words, he
finds that the large scale mean bias relation between low-mass haloes
and the mass is larger than the mean bias relation that the Mo &
White model predicts. It is interesting that this is the same trend
we found in our study of the Lagrangian space halo distribution. This
has an important consequence.

The Mo & White model has two parts: the first is a 
model of the initial number density and spatial distribution 
of haloes, and the second models their subsequent dynamical 
evolution. Given only Jing’s result, one might have thought 
that the Mo & White model fails only in the second step: that
using the spherical model to translate from Lagrangian to 
Eulerian space is inaccurate. If so, one might have thought 
that the Zel’dovich approximation, or variants of it, could 
be combined with the initial distribution described here to 
derive accurate estimates of the evolution of the spatial dis¬ 
tribution of massive as well as less massive haloes. This is 
the sort of approach taken by Catelan, Matarrese & Porciani
(1998). To date, they have only studied the halo distribu¬ 
tion on scales larger than that of a typical halo, since their 
approach does not allow them to account for the effects of 
volume exclusion. Since we are able to account for volume 
exclusion, it may be interesting to combine some of the re¬ 
sults presented here with their work. 

However, our results show that the Mo & White model 
fails in Lagrangian space: it does not describe the initial spa¬ 
tial distribution of low mass haloes correctly. This is not so 
surprising, since it is well known that the spherical model 
for the collapse of haloes, on which the first step of the Mo 
& White model is based, is more likely to be accurate for 
massive objects than for less massive ones (e.g. Bernardeau 
1994). If it is not so much the spherical model of the evolu¬ 
tion of the halo distribution, but rather the spherical collapse 
model for the formation of small mass haloes itself that is 
wrong, then we expect the discrepancy Jing measures for 
the Eulerian space distribution of the haloes in his simula¬ 
tions to be reflected in the shape of the unconditional mass 
function. The mass function in the simulations does indeed 
differ from the Press-Schechter function, and this difference 
is in the correct sense: whereas the theory predicts approxi¬ 
mately the correct number of massive haloes, there are fewer 
low mass haloes in the simulations than the Press-Schechter 
formula predicts. Quantifying this relation between the un¬ 
conditional mass function and the large scale bias relation 
is the subject of ongoing work. 

In this paper we have gone to a fair amount of trouble 
to derive a realistic, accurate, analytic model for the scatter 
in the halo-to-mass bias relation. This is because knowledge 
of this scatter allows one to address a number of interest¬ 
ing problems, some of which we list briefly below. To relate 
these results to the observed distribution of galaxies is com¬ 
plicated. Galaxies are thought to form inside dark matter 
haloes (White & Rees 1978; White & Frenk 1991). Semi-analytic
models of this galaxy formation process (e.g. Kauffmann,
White & Guiderdoni 1993) show that the number of
galaxies which form in a given dark matter halo is stochas¬ 
tic. Lemson & Kauffmann (1999) showed that most of the 
physical parameters of a dark matter halo on which galaxy 
formation processes are expected to depend, while they may 
depend on the halo mass, are independent of the halo’s envi¬ 
ronment. Thus, their results suggest that quantities like the 
average number, or the scatter in this number, of galaxies 
in a dark matter halo ultimately depend on the halo mass. 
So it should be possible to provide semi-analytic estimates 
of the mean galaxy-number-to-halo-mass bias relation, as 
well as the scatter in this relation. When combined with 
our results for the mean and higher order moments of the 
bias between dark matter haloes and the underlying matter 
distribution, such a relation would allow one to relate the 
observed galaxy distribution to that of the underlying dark 
matter distribution. Thus, our expressions for the scatter in 
the halo-dark matter bias relation can be used to extend the 
results of Kauffmann, Nusser & Steinmetz (1997) to smaller 
scales. In addition, combining the galaxy number to halo 
mass bias relation with the dark halo to dark matter bias 
relation may allow one to compute estimates of the expected 
scatter in the Tully-Fisher relation, to study the bias associ¬ 
ated with estimating Ho from redshift distortions (Pen 1998; 
Dekel & Lahav 1998), to evaluate the compatibility between 
observations of the number density and correlation functions 
of objects at high redshift and various cosmological models 
(Mo, Mao & White 1998), and to model the evolution of the 
cluster-cluster correlation function in different cosmological 
models (Mo, Jing & White 1997). 

This paper has dealt primarily with the problem of quantifying the
mean and higher order moments of the halo bias given the matter
fluctuation field (e.g. $\langle\delta_h|\delta_m\rangle$). The
inverse problem is equally, if not more, interesting. The problem of
estimating the mean and higher order moments of the matter
fluctuation field given the halo distribution (e.g.
$\langle\delta_m|\delta_h\rangle$) is the subject of ongoing work.

APPENDIX A: POISSON INITIAL
CONDITIONS


In this Appendix the excursion set approach is used to derive 
expressions for the unconstrained mass function, and then 
for the constrained mass function and associated merger 
probabilities. The approach follows and extends that of Ep¬ 
stein (1983) and Sheth (1995) in the following way. These 
earlier analyses considered spherical collapse around parti¬ 
cles in the initial Poisson distribution. However, in this pa¬ 
per we want to compute averages over all randomly placed 
cells in the initial distribution, not just those that are cen¬ 
tred on particles. So, in the next few subsections, we de¬ 
rive expressions for the constrained and unconstrained mass 
functions where the restriction to volumes centred on par¬ 
ticles has been dropped. It turns out that the modification 
to the previously derived expressions is trivial. Therefore, 
the first two subsections may seem a little pedagogical—we 
have included them to set notation. Readers familiar with 
the Poisson excursion set analysis may prefer to skip directly 
to Section A3. 

The spatial distribution of these haloes, in the initial Lagrangian
space, is described in Sections A3-A5. Comparison of these results
with numerical simulations is often done for haloes having a range of
masses. There is some subtlety in doing this correctly; this is
discussed in Section A6. That all these Poisson results are easily
extended to describe clustering from white noise initial conditions
is shown in Section A7. Essentially, those statements about
clustering from white noise initial conditions which are known (e.g.,
the conditional and unconditional mass functions, and the mean bias
relation) can be derived by taking appropriate limits of the
corresponding Poisson statements. The same limiting procedure can be
used to derive statements about the higher order moments of the
Lagrangian space halo distribution. It is these expressions that are
presented in the main text.


A1 The unconstrained mass function 


Consider a Poisson distribution of particles with mean density
$\bar n$. This means that a volume of size V placed at a random
position in this distribution will contain exactly N particles with
probability

$$p(N,V) = \frac{(\bar n V)^N\, e^{-\bar n V}}{N!}. \qquad (A1)$$

Furthermore, if it is known that there are N particles in $V_2$, then
the probability that there are j particles in $V_1$ placed randomly
within $V_2$ is

$$p(j,V_1|N,V_2) = \frac{p(j,V_1)\; p(N-j,\,V_2-V_1)}{p(N,V_2)}. \qquad (A2)$$
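
A short numerical sketch may help make these two probabilities concrete. It assumes the standard Poisson form for the counts in a randomly placed volume, and checks that the conditional probability built from the ratio of Poisson terms reduces to a binomial distribution with success probability $V_1/V_2$, independent of the mean density. The function names are illustrative only.

```python
import math


def poisson_pmf(N, nbar, V):
    """Probability that a random volume V holds exactly N particles."""
    mu = nbar * V
    return math.exp(N * math.log(mu) - mu - math.lgamma(N + 1))


def conditional_pmf(j, V1, N, V2, nbar):
    """Probability of j particles in V1, given N particles in V2 > V1."""
    return (poisson_pmf(j, nbar, V1) * poisson_pmf(N - j, nbar, V2 - V1)
            / poisson_pmf(N, nbar, V2))


def binomial_pmf(j, N, q):
    """Binomial probability of j successes in N trials."""
    return math.comb(N, j) * q ** j * (1.0 - q) ** (N - j)


# The conditional counts are binomial with q = V1/V2, whatever nbar is.
N_total, V1, V2 = 7, 0.3, 1.0
cond_low = [conditional_pmf(j, V1, N_total, V2, nbar=0.5)
            for j in range(N_total + 1)]
cond_high = [conditional_pmf(j, V1, N_total, V2, nbar=5.0)
             for j in range(N_total + 1)]
binom = [binomial_pmf(j, N_total, V1 / V2) for j in range(N_total + 1)]
```

The mean density cancels in the ratio, which is why the conditional counts depend only on the volume fraction.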



Now choose a random position in the distribution, and compute the
density within concentric spheres centred on this position. Call the
curve traced out by the number of particles contained within a sphere
V centred on this point, as a function of the sphere size V, a
trajectory. Then each position in the Poisson distribution has its
associated trajectory. Let $f^c(\delta_1)$ denote the probability
that, for all concentric spheres centred on a randomly chosen
position, the density never exceeds the threshold value
$\bar n(1 + \delta_1)$. One way to compute this probability is to
compute the fraction of trajectories for which
$N(V) < \bar n V(1 + \delta_1)$ for all V, where $N(V)$ is the number
of particles within V. This quantity can be computed as follows.

Start with an arbitrarily small sphere centred at the chosen
position, and consider successively larger concentric spheres. As the
volume increases by an infinitesimal amount, the number of particles
contained within the current sphere either remains the same, or
increases by one. (Strictly speaking, the probability that the number
of particles increases by one is an infinitesimal, the probability
that the number increases by two is an infinitesimal of higher order,
an increase by three particles is an infinitesimal of still higher
order, and so on.) Therefore, a given value of $\delta_1$ defines a
series of volumes $V_1 < V_2 < \cdots$ for which

$$j/V_j = \bar n(1 + \delta_1) \equiv \bar n/b_1. \qquad (A3)$$

The final equality defines $b_1 = 1/(1 + \delta_1)$, a parameter
which will be useful later. The quantity of interest,
$f^c(\delta_1)$, is one minus the probability that $V_j$ is the
largest sphere centred at the chosen position that has density
$\bar n(1 + \delta_1)$, summed over all $V_j$. That is,


$$1 - f^c(\delta_1) = \sum_{j=1}^{\infty} p(j,V_j)\, f^c(\delta_1|j,V_j), \qquad (A4)$$

where the first term in the sum is the probability that $V_j$
contains exactly j particles, and the second term expresses the
probability that, given that $V_j$ contains exactly j particles, no
concentric sphere larger than $V_j$ is denser than it. Epstein (1983)
shows that

$$f^c(\delta_1|j,V_j) = \frac{\delta_1}{1 + \delta_1} = 1 - b_1, \qquad (A5)$$

and he discusses why it is independent of j. Thus,

$$1 - f^c(\delta_1) = (1 - b_1) \sum_{j=1}^{\infty} p(j,V_j) = b_1, \qquad (A6)$$


where the sum is simplified by recognizing that it is $b_1$ times the
first moment of the Borel distribution (Borel 1942). This shows that
$f^c(\delta_1) = 1 - b_1$, so that it is the same as
$f^c(\delta_1|j,V_j)$. This is simply a consequence of the fact that,
since the distribution is Poisson, the probability that all larger
volumes containing a given volume are less dense than a given value
depends only on the density within the volume, and not on the number
or the distribution of the particles within it.

The expression above implies that the probability that at least one
sphere centred on a randomly chosen position in a Poisson
distribution is denser than $\bar n(1 + \delta_1)$ is $b_1$. In other
words, of the infinity of spatial positions in a Poisson
distribution, and of the infinity of associated trajectories, only a
fraction $b_1$ are at the centre of at least one sphere that is
denser than $\bar n(1 + \delta_1)$. That is, only a fraction $b_1$ of
the trajectories ever have $N(V) > \bar n V(1 + \delta_1)$ for at
least one value of V.

Let $F(j,b_1)$ denote the fraction of trajectories for which
$N(V_j) = j$, and for which $N(V_k) < k$ for all $V_k > V_j$. Then

$$F(j,b_1) = p(j,V_j)\, f^c(\delta_1|j,V_j), \qquad (A7)$$

where the first term gives the probability that a trajectory has
$N(V_j) = j$, and the second term gives the probability that $V_j$ is
the largest volume at which the trajectory exceeds the threshold
$\bar n(1 + \delta_1)$.

There is a useful relation between equations (A1), (A2) and (A7).
Let ${}_2V_k = k b_2/\bar n$. Then

$$p(j,V) = \sum_{k=j}^{\infty} p(j,V|k,{}_2V_k)\, F(k,b_2), \qquad (A8)$$

provided $j/V > \bar n$, and $b_2 < b_1 = \bar n V/j$. To see this,
note that the left hand side includes all trajectories that have
value j at V. Suppose each trajectory is labelled by the value of k
for which ${}_2V_k$ is the largest volume at which that trajectory
crossed the line $\bar n(1 + \delta_2)$. Trajectories which cross the
line for the final time with value less than j cannot also pass
through V with value j. Therefore, the sum on the right hand side is
only over those trajectories that cross the line
$\bar n(1 + \delta_2)$ for the final time with $k \ge j$, and also
pass through V with value j. Clearly, the left hand side must equal
the right. Direct substitution shows that equations (A1), (A2) and
(A7) do satisfy this relation. The normalization and first moment of
Consul's (1989) generalized Poisson distribution aid in proving this
result.

Define an isolated region as a spherical region within which the
average density is $\bar n(1 + \delta_1)$, and for which the average
density within all larger concentric spheres is less than this. Then
$F(j,b_1)$ denotes the fraction of space that is associated with
isolated $(j,b_1)$-volumes. If $N(j,b_1)$ denotes the number of such
volumes, and $V_U$ denotes the total volume, then

$$F(j,b_1) = N(j,b_1)\,\frac{V_j}{V_U} = \frac{N(j,b_1)}{V_U}\,\frac{j b_1}{\bar n},$$

so that the number density $n(j,b_1)$ of such isolated volumes is

$$n(j,b_1) = \frac{\bar n}{j b_1}\, F(j,b_1) = \bar n(1 - b_1)\,\frac{(j b_1)^{j-1}\, e^{-j b_1}}{j!} = \bar n(1 - b_1)\,\eta(j,b_1), \qquad (A9)$$

where $\eta(j,b_1)$ is the Borel($b_1$) distribution. Thus, the
Borel($b_1$) distribution gives the probability that an isolated
region contains exactly j particles [since
$\sum_j \eta(j,b_1) = 1$, and
$\sum_j j\,\eta(j,b_1) = 1/(1 - b_1)$].

Following Bond et al. (1991), it will be convenient to 
associate thes e isolated regions with collapsed haloes. Then 
equation (A9) is the unconditional mass function, since it 
gives the number density of collapsed objects that contain 
exactly j particles. 
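
The sums used in this subsection can be checked numerically. The Python sketch below is illustrative only: the function names (`borel`, `poisson`, `F`, `rhs_A8`) are hypothetical, and it assumes that equation (A1) is the usual Poisson probability and that equation (A2) is the usual binomial probability for a sub-volume. It evaluates the Borel normalization and first moment, the total fraction $\sum_j F(j,b_1) = b_1$ of equation (A6), and the two sides of relation (A8) for $b_2 > b_1$.

```python
import math

def borel(j, b):
    """Borel(b) probability eta(j, b) = (j b)^(j-1) exp(-j b) / j!, for j >= 1."""
    return math.exp((j - 1) * math.log(j * b) - j * b - math.lgamma(j + 1))

def poisson(N, mean):
    """Poisson probability of N counts (assumed form of equation A1)."""
    return math.exp(N * math.log(mean) - mean - math.lgamma(N + 1))

def F(j, b):
    """Equation (A7): F(j, b) = p(j, V_j) (1 - b), where nbar * V_j = j b."""
    return poisson(j, j * b) * (1.0 - b)

def rhs_A8(j, x, b2, kmax=1000):
    """Right-hand side of (A8) for a volume V with mean count x = nbar * V.

    The infinite sum over k is truncated at kmax (an assumption of this
    sketch); the terms decay geometrically for b2 < 1.
    """
    total = 0.0
    for k in range(j, kmax):
        q = x / (k * b2)  # V / 2V_k, the binomial success probability
        total += math.comb(k, j) * q**j * (1.0 - q)**(k - j) * F(k, b2)
    return total

if __name__ == "__main__":
    b = 0.5
    js = range(1, 2000)
    print(sum(borel(j, b) for j in js))       # normalization of eta
    print(sum(j * borel(j, b) for j in js))   # first moment, 1 / (1 - b)
    print(sum(F(j, b) for j in js))           # total fraction, b
    # (A8) with j = 3, b1 = 0.3 (so x = j b1 = 0.9) and b2 = 0.5 > b1
    print(rhs_A8(3, 0.9, 0.5), poisson(3, 0.9))
```

The truncation limits are chosen for $b \le 0.5$; values of $b$ closer to unity converge more slowly and would need larger limits.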


It is interesting to compare equations (A7) and (A9)
with the results of Epstein (1983). In his analysis, Epstein
only considered those trajectories that were certainly centred
on particles of the Poisson distribution. Here, that restriction
has been dropped. Let $f(j,b_1)$ denote the fraction
of trajectories that are centred on particles and are associated
with isolated regions containing exactly $j$ particles.
Epstein's expression for $f(j,b_1)$ implies that

$$F(j,b_1) = b_1\, f(j,b_1). \qquad \mathrm{(A10)}$$

Thus, the effect of considering the set of all trajectories,
rather than the subset that are centred on particles, is simply
to introduce the $b_1$ term. This is sensible. In the limit in
which the threshold $\delta_1 \to \infty$, $b_1 \to 0$. In this limit, the
only trajectories that ever exceed the threshold are those
that are centred on particles, and they exceed the threshold
only when the volume is vanishingly small. In this limit,
$f(j,b_1) = 1$ if $j = 1$, and it is zero otherwise. On the other
hand, the subset of trajectories that are centred on particles
is a vanishingly small fraction of the set of all trajectories,
so that, as $\delta_1 \to \infty$, the fraction of all trajectories that ever
exceed $\delta_1$ tends to zero. So, in this limit, $F(j,b_1) \to 0$ for all $j$.


A2 The constrained mass function 


The probability that a randomly placed volume ${}_1V_j$ contains
exactly $j$ particles and has density $\bar n(1+\delta_1)$, and that the
larger volume ${}_2V_k > {}_1V_j$ including ${}_1V_j$ contains exactly $k$
particles, has density $\bar n(1+\delta_2)$, and is isolated, is

$$p(j,{}_1V_j)\; p(k-j,\,{}_2V_k-{}_1V_j)\; f^s(\delta_2|k,{}_2V_k).$$

Equation (A5) shows that $f^s(\delta_2|k,{}_2V_k) = (1-b_2)$. The
probability $F(j,b_1|k,b_2)$ that ${}_1V_j$ is itself isolated within the
isolated region ${}_2V_k$ [that is, the average density within all
volumes $V$ that include ${}_1V_j$ and are within ${}_2V_k$ is less than
$\bar n(1+\delta_1)$] satisfies a recursion relation:

$$F(j,b_1|k,b_2) = \frac{p(j,{}_1V_j,k,{}_2V_k)\, f^s(\delta_2|k,{}_2V_k)}{F(k,b_2)} - \sum_{m>j}^{k} F(m,b_1|k,b_2)\; p(j,{}_1V_j|m,{}_1V_m). \qquad \mathrm{(A11)}$$

The numerator in the first term on the right is the joint
probability above, in which ${}_1V_j$ is not necessarily isolated. The
denominator is included since it is known that ${}_2V_k$ is isolated.
From this first term, we must subtract the probability that
a volume ${}_1V_m$ containing ${}_1V_j$ was itself the largest isolated
region within ${}_2V_k$. This is just the product of the probability
$F(m,b_1|k,b_2)$ times the probability $p(j,{}_1V_j|m,{}_1V_m)$ that
there were exactly $j$ particles within ${}_1V_j$ given that they were
within the isolated region ${}_1V_m$, summed over all $m$ larger
than $j$. Now,

$$p(j,{}_1V_j|m,{}_1V_m) = \frac{p(j,{}_1V_j)\,p(m-j,\,{}_1V_m-{}_1V_j)}{p(m,{}_1V_m)} = \binom{m}{j}\left(\frac{{}_1V_j}{{}_1V_m}\right)^{j}\left(1-\frac{{}_1V_j}{{}_1V_m}\right)^{m-j} = \binom{m-1}{j-1}\left(\frac{j}{m}\right)^{j-1}\left(1-\frac{j}{m}\right)^{m-j}, \qquad \mathrm{(A12)}$$

since ${}_1V_k = k b_1/\bar n$. This binomial-like term is necessary
because not all configurations of particles that contribute to
$F(m,b_1|k,b_2)$ will have had exactly $j$ particles within ${}_1V_j$.
Appendix shows that

$$F(j,b_1|k,b_2) = \left(1-\frac{b_1}{b_2}\right)\binom{k}{j}\left(\frac{j b_1}{k b_2}\right)^{j}\left(1-\frac{j b_1}{k b_2}\right)^{k-j-1} \qquad \mathrm{(A13)}$$

satisfies the recursion relation given above.
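
As a numerical illustration, the recursion (A11) can be solved downward from $j = k$ and compared with the closed form (A13). This is a sketch under stated assumptions rather than the authors' procedure: the function names are hypothetical, and the first term on the right of (A11) is taken to reduce to a binomial probability with success probability ${}_1V_j/{}_2V_k = jb_1/kb_2$, as follows from Poisson statistics.

```python
import math

def binom_pmf(j, m, q):
    """Probability of j successes in m trials with success probability q."""
    return math.comb(m, j) * q**j * (1.0 - q)**(m - j)

def F_cond_closed(j, k, b1, b2):
    """Closed form (A13) for F(j, b1 | k, b2); requires b1 < b2."""
    r = j * b1 / (k * b2)
    return (1.0 - b1 / b2) * math.comb(k, j) * r**j * (1.0 - r)**(k - j - 1)

def F_cond_recursive(k, b1, b2):
    """Solve recursion (A11) from j = k down to j = 1.

    Returns a dict mapping j to F(j, b1 | k, b2).
    """
    F = {}
    for j in range(k, 0, -1):
        # First term of (A11): joint probability divided by F(k, b2),
        # which is binomial with success probability 1V_j / 2V_k.
        first = binom_pmf(j, k, j * b1 / (k * b2))
        # Subtract cases in which a larger volume 1V_m was the isolated one;
        # p(j, 1V_j | m, 1V_m) is binomial with probability j / m (A12).
        correction = sum(F[m] * binom_pmf(j, m, j / m)
                         for m in range(j + 1, k + 1))
        F[j] = first - correction
    return F

if __name__ == "__main__":
    k, b1, b2 = 6, 0.3, 0.7
    recursive = F_cond_recursive(k, b1, b2)
    for j in range(1, k + 1):
        print(j, recursive[j], F_cond_closed(j, k, b1, b2))
```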

Let $f(j,b_1|k,b_2)$ denote the corresponding expression
for volumes ${}_1V_j$ that are known to certainly be centred on
a particle. Then $f(j,b_1|k,b_2)$ is given by equation (40) of
Sheth (1995), and

$$F(j,b_1|k,b_2) = (b_1/b_2)\, f(j,b_1|k,b_2). \qquad \mathrm{(A14)}$$

Thus, as with the statements for $F(j,b_1)$, the expressions for
randomly placed volumes are easily related to those for volumes
that are centred on particles. The $(b_1/b_2)$ factor here
plays the same role as the factor $b_1$ in equation (A10). It simply
reflects the fact that, for a Poisson distribution, the particles
within ${}_2V_k$ are distributed as though they are part of
a Poisson distribution with average density $\bar n(1+\delta_2)$, rather
than $\bar n$. Moreover, the discussion in the final paragraph of
section A1 applies to the limiting behaviour of $F(j,b_1|k,b_2)$
as $\delta_1 \to \infty$, i.e., as $b_1 \to 0$, just as it did for the limiting
behaviour of $F(j,b_1)$.

The similarity between $F(j)$ and $F(j|k)$ can be made
still more striking. Suppose there are $k$ particles in the volume
${}_2V_k$ and $j < k$ particles in the subvolume ${}_1V_j$ within it.
Then $\bar n(1+\delta') = \bar n(k-j)/(kb_2-jb_1)$ is the density in the
remaining volume ${}_2V_k - {}_1V_j$, and

$$F(j,b_1|k,b_2) = \frac{\delta_1-\delta'}{1+\delta_1}\; p(j,{}_1V_j|k,{}_2V_k), \qquad \mathrm{(A15)}$$

where $p(j|k)$ is given by equation (A2). Equation (A7) shows
that $F(k,b_1)$ is given by an analogous expression; there, the
remaining volume is infinite, so that the overdensity in it,
$\delta'$, is 0 by definition. Thus, $F(j|k)$ is related to $p(j|k)$ in the
same way that $F(j)$ is related to $p(j)$.


Recall that, although $F(j,b_1)$ differed from $f(j,b_1)$, the
final expression for the number density of isolated $(j,b_1)$-volumes
was the same for randomly placed volumes as for
volumes centred on a particle (equation A9). The same is
true here. If $\mathcal N(j,b_1|k,b_2)$ denotes the average number of
isolated $(j,b_1)$-volumes within a randomly placed $(k,b_2)$-volume,
then

$$F(j,b_1|k,b_2) = \mathcal N(j,b_1|k,b_2)\,\frac{{}_1V_j}{{}_2V_k} = \mathcal N(j,b_1|k,b_2)\,\frac{j b_1}{k b_2},$$

so that

$$\mathcal N(j,b_1|k,b_2) = \frac{k b_2}{j b_1}\,F(j,b_1|k,b_2) = \frac{k}{j}\, f(j,b_1|k,b_2). \qquad \mathrm{(A16)}$$

The final expression is the same as equation (45) of Sheth
(1995). Thus, equation (A16) shows that the average number
of $(j,b_1)$-volumes that are within a $(k,b_2)$-volume is the same
when ${}_2V_k$ is placed randomly in the Poisson distribution as
when it is centred on a particle. In terms of collapsed haloes,
this expression is similar to equation (A9), except that here
the $(j,b_1)$-halo is constrained to be within a $(k,b_2)$-halo.
Thus, this expression gives the conditional mass function.
Notice that

$$\mathcal N(j,b_1|k,b_2) \to (k/j)\, f(j,b_1/b_2) \quad \mathrm{when}\ k \gg j. \qquad \mathrm{(A17)}$$

Comparison with equation (A9) shows that, in this limit, the
number density of $(j,b_1)$-volumes that are within a $(k,b_2)$-volume
is the same as in the unconstrained case; the only
difference is that $b_1 \to b_1/b_2$, which reflects the fact that the
background density within ${}_2V_k$ is $\bar n(1+\delta_2)$, rather than $\bar n$.

All the arguments above were phrased entirely in terms
of volumes that were concentric spheres. This was done with
a view to improving the clarity of the presentation; the entire
analysis applies unchanged for arbitrarily shaped volumes.
This is because the underlying distribution is Poisson,
so that all statements depend only on volumes $V$ and
not their shapes, and all volumes can be broken up into
mutually independent sub-volumes. This is also why the
dimensionality of the point distribution does not enter into
the analysis anywhere. Appendix here shows this explicitly.
In this respect, the statements above are obtained by an
averaging process that is similar in spirit to that described
in the Appendix of Bower (1991).


A3 Clustering of haloes in Lagrangian space: the 
mean number of haloes 

This section derives the first moment of the distribution of
halo counts in randomly placed cells. The following sections
describe the distribution of haloes in randomly placed cells
when the halo mass is specified, and Section A6 considers
the distribution for a range of masses.

To compute the mean number of haloes in randomly
placed cells, it is useful to consider another way of computing
$F(j,b_1)$. This alternative method also shows that dropping
the Epstein (1983) and Sheth (1995) restriction (to only
those volumes that are centred on particles) makes only a
trivial difference to the final expression for $F(j,b_1)$.

Let $f(N,V_0)$ denote the probability that there are exactly
$N$ particles within the sphere $V_0$, given that $V_0$ is centred
on a randomly chosen particle in the Poisson distribution.
Then

$$f(N,V_0) = p(N-1,V_0), \qquad \mathrm{(A18)}$$

where $p(k,V_0)$ is given by equation (A1). The probability
that there are $j$ particles in the sphere $V_j$ centred on
the chosen particle, and all concentric spheres $V$ satisfying
$V_j < V < V_0$ are less dense than $V_j$, given that there are $N$
particles in the concentric sphere $V_0 > V_j$, is $f(j,b_1|N,b_0)$,
where $1 \le j \le N$, $b_1$ was defined above, and $b_0 = \bar n V_0/N$.

Let $N_{01}$ denote the largest integer less than $\bar n V_0/b_1$; it
is the maximum number of particles that may be in $V_0$, if
$V_0$ is to be concentric to and less dense than $V_j$. Also, let
$f^s(b_1|N,V_0)$ denote the probability that no sphere $V_k > V_0$
is concentric to and denser than the sphere $V_j$, given that
there are exactly $1 \le N \le N_{01}$ particles within $V_0$. This
quantity is just one minus the probability that there exists
a sphere $V_k > V_0$ which is the largest sphere concentric to
and having the same density as $V_j$, given that there are $N$
particles within $V_0$. Then,

$$f^s(b_1|N,V_0) = 1 - \sum_{k>N_{01}} p(k-N,\,V_k-V_0)\, f^s(b_1|k,V_k), \qquad \mathrm{(A19)}$$

and equation (A5) shows that we can replace $f^s(b_1|k,V_k)$
with $(1-b_1)$. Define

$$Q(b_1,N,V_0) = p(N,V_0)\, f^s(b_1|N,V_0). \qquad \mathrm{(A20)}$$

Equations (A2), (A7) and (A19) imply that

$$Q(b_1,N,V_0) = p(N,V_0) - \sum_{k>N_{01}}^{\infty} p(N,V_0|k,V_k)\, F(k,b_1). \qquad \mathrm{(A21)}$$

This, with equation (A8), shows that $Q = 0$ when $N > N_{01}$.
In terms of these quantities,

$$f(j,b_1) = \sum_{N=j}^{N_{01}} f(j,b_1|N,b_0)\, f(N,V_0)\, f^s(b_1|N,V_0). \qquad \mathrm{(A22)}$$

Now, equation (A3) implies that $b_0 = \bar n V_0/N$, so this sum
expresses $f(j,b_1)$ in terms of volumes $V_0$ that are certainly
centred on a particle. However, equations (A1) and (A18)
show that

$$f(N,V_0) = p(N-1,V_0) = \frac{N}{\bar n V_0}\, p(N,V_0) = p(N,V_0)/b_0,$$

so, using equations (A10) and (A14),

$$F(j,b_1) = \sum_{N=j}^{N_{01}} F(j,b_1|N,b_0)\, p(N,V_0)\, f^s(b_1|N,V_0). \qquad \mathrm{(A23)}$$

This final expression is written entirely in terms of randomly
placed volumes, since $f^s(b_1|N,V_0)$ depends only on
the fact that there are exactly $N$ particles within $V_0$, and
not on whether or not one of those particles is at the centre.
Straightforward but tedious algebra shows that this sum is
consistent with the expressions for $f(j,b_1)$ and $F(j,b_1)$
derived earlier.

This calculation can be easily manipulated to give the
average number of isolated $(j,b_1)$-volumes that are in randomly
placed cells of size $V_0$. It is

$$\bar n(j,b_1)V_0 = \sum_{N=j}^{N_{01}} \mathcal N(j,b_1|N,b_0)\, Q(b_1,N,V_0). \qquad \mathrm{(A24)}$$

The sum on the right is $(\bar n V_0/jb_1)$ times the one in
equation (A23), so it is equal to $\bar n V_0\, f(j,b_1)/j$. Comparison with
equation (A9) shows explicitly that the mean number of
isolated $(j,b_1)$-volumes that are in randomly placed cells of
size $V_0$ is $V_0$ times the average density of these haloes, as
required.
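
Equation (A24) lends itself to a direct numerical check. The sketch below is an illustration, not part of the original derivation: the function names are hypothetical, $x$ stands for $\bar n V_0$, the infinite sum in (A21) is truncated at an arbitrary `kmax`, and $\mathcal N(j,b_1|N,b_0)$ is written in the closed form that follows from combining equations (A13) and (A16).

```python
import math

def borel(j, b):
    """Borel(b) probability eta(j, b) = (j b)^(j-1) exp(-j b) / j!."""
    return math.exp((j - 1) * math.log(j * b) - j * b - math.lgamma(j + 1))

def poisson(N, mean):
    """Poisson probability of N counts (assumed form of equation A1)."""
    return math.exp(N * math.log(mean) - mean - math.lgamma(N + 1))

def n_01(x, b1):
    """Largest integer strictly less than x / b1, with x = nbar * V0."""
    return math.ceil(x / b1) - 1

def Q(N, b1, x, kmax=800):
    """Equation (A21), with the sum over k truncated at kmax."""
    total = poisson(N, x)
    for k in range(n_01(x, b1) + 1, kmax):
        q = x / (k * b1)  # V0 / V_k
        p_cond = math.comb(k, N) * q**N * (1.0 - q)**(k - N)
        total -= p_cond * (1.0 - b1) * poisson(k, k * b1)  # p(N|k) F(k, b1)
    return total

def n_cond(j, N, b1, x):
    """Mean number N(j, b1 | N, b0) from (A13) and (A16), with N b0 = x."""
    r = j * b1 / x
    return ((1.0 - N * b1 / x) * math.comb(N, j)
            * r**(j - 1) * (1.0 - r)**(N - j - 1))

def mean_haloes_in_cell(j, b1, x):
    """Right-hand side of (A24)."""
    return sum(n_cond(j, N, b1, x) * Q(N, b1, x)
               for N in range(j, n_01(x, b1) + 1))

if __name__ == "__main__":
    b1, x = 0.5, 3.3
    for j in range(1, 5):
        lhs = x * (1.0 - b1) * borel(j, b1)  # nbar(j, b1) V0 from (A9)
        print(j, lhs, mean_haloes_in_cell(j, b1, x))
```

The parameter values are chosen so that $x/b_1$ is not an integer, which avoids a boundary case in the definition of $N_{01}$.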


A4 Cross correlation between haloes and mass 


It is also straightforward to compute a measure of the cross
correlation between $(j,b_1)$-haloes and the total number of
particles that are in randomly placed cells of size $V_0$.

Recall that $\mathcal N(j,b_1|N,b_0)$ denotes the average number
of $(j,b_1)$-haloes within an $(N,b_0)$-halo. This expression also
represents the average number of $(j,b_1)$ isolated regions that
are within isolated regions $V_0$ which each have density $N/V_0$.
Since these regions are isolated, they are different from a
random region of size $V_0$ containing $N$ particles; recall that
only a fraction $f^s(b_1|N,V_0)$ of such random regions may
contain a $b_1$-halo (and, of course, the number of particles
in the $b_1$-halo may not exceed $N$). The average number of
$(j,b_1)$-haloes in the remaining $V_0$ cells (those that contain
exactly $N \ge j$ particles and are not isolated) is zero.

Thus, the average overabundance of $(j,b_1)$-haloes
within the fraction $f^s(b_1|N,V_0)$ of randomly placed $V_0$s that
are isolated is

$$\delta_h^L(j,b_1|N,b_0) = \frac{\mathcal N(j,b_1|N,b_0)}{\bar n(j,b_1)V_0} - 1 \qquad \mathrm{(A25)}$$

(Mo & White 1996), and $\delta_h^L = -1$ in the remaining $V_0$s. The
superscript L represents the fact that this expression defines
a bias relation that is associated with randomly placed regions
$V_0$ in the initial Lagrangian space. As Mo & White
(1996) note, in general, dynamical evolution will result in a
bias relation in Eulerian space that is different from this one
in the Lagrangian space. Notice that, because $\delta_h^L$ is the
average overabundance of haloes, it depends only on the first
moment of the halo distribution. To compute the rms scatter
around this mean value requires knowledge of the higher
order moments of the halo distribution. We will compute
this scatter later in this paper.

When $N \gg j$, $f(j,b_1|N,b_0) \to f(j,b_1/b_0)$ (Appendix B
in Sheth 1996), and $f^s(b_1|N,V_0) \to 1$, so

$$\delta_h^L(j,b_1|N,b_0) \to \frac{N}{\bar n V_0}\,\frac{f(j,b_1/b_0)}{f(j,b_1)} - 1. \qquad \mathrm{(A26)}$$

This relation will be useful later.

Define

$$\bar\xi_{hm}(j,b_1|V_0) = \left\langle \delta_h^L(1|0)\,\delta_0 \right\rangle, \qquad \mathrm{(A27)}$$

where $\delta_h^L(1|0)$ is given by equation (A25), $\delta_0 = N/\bar n V_0 - 1$,
and the average above is over all randomly placed $V_0$. Writing
all the terms out explicitly gives

$$\bar\xi_{hm}(j,b_1|V_0) = \left\langle \frac{\mathcal N(1|0)}{\bar n(j,b_1)V_0}\,\frac{N}{\bar n V_0}\right\rangle - \left\langle\frac{N}{\bar n V_0}\right\rangle - \left\langle\frac{\mathcal N(1|0)}{\bar n(j,b_1)V_0}\right\rangle + 1, \qquad \mathrm{(A28)}$$

where $\bar n(j,b_1)$ is given by equation (A9), and $\mathcal N(1|0)$ by
equation (A16). The second term in this expression is
$(N/\bar n V_0)\, p(N,V_0)$, summed over all $N$, so it is unity, and
it cancels the fourth term. The first and third terms have
$\mathcal N(1|0) = 0$ if the cells are not isolated, so they only receive a
non-zero contribution from the fraction $f^s(b_1|N,V_0)$ of cells
that are isolated. Writing the sum which gives the average
explicitly, and then using equation (A20), shows that

$$\bar\xi_{hm}(j,b_1|V_0) = \sum_{N=j}^{N_{01}} \delta_0\,\frac{\mathcal N(1|0)}{\bar n(j,b_1)V_0}\; Q(b_1,N,V_0). \qquad \mathrm{(A29)}$$

The upper limit on the sum comes from the fact that, if a
randomly placed $V_0$ were to contain more particles, then it
would be denser than $b_1$, so the $(j,b_1)$-regions inside it would
not be isolated, and $\mathcal N(1|0) = 0$. This final expression is the
cross correlation between $(j,b_1)$-haloes and mass, averaged
over all randomly placed Lagrangian cells $V_0$.


A5 The higher order moments of the halo 
distribution 


Previous subsections computed the mean number of isolated
$(j,b_1)$-regions, i.e., the mean number of $(j,b_1)$-haloes that
are in randomly placed cells of size $V_0$. This subsection
computes the higher order moments of the distribution. To do so,
it is necessary to examine the expression for $\mathcal N(j,b_1|N,b_0)$
in more detail.

Let $p(\mathbf n,b_1|N,b_0)$, where $\mathbf n = (n_1,\cdots,n_N)$ and $b_0 > b_1$,
denote the probability that the volume $V_0 = {}_0V_N$ is composed
of $m$ isolated subvolumes, of which there are $n_j$ isolated
$(j,b_1)$-volumes (each of size ${}_1V_j$), and $1 \le j \le N$.
Thus, $\sum_{j=1}^{N} n_j = m$, and mass conservation requires that
$\sum_{j=1}^{N} j\,n_j = N$. Sheth (1996) describes a model, based on
the Poisson distribution, in which

$$p(\mathbf n,b_1|N,b_0) = \frac{(Nb_{01})^{m-1}\,e^{-Nb_{01}}}{\eta(N,b_0)}\;\prod_{j=1}^{N}\frac{\eta(j,b_1)^{n_j}}{n_j!}, \qquad \mathrm{(A30)}$$

where $b_{01} = (b_0 - b_1)$, and $Nb_0 = \bar n V_0$. See Sheth & Pitman
(1997), and Sheth & Lemson (1998), for other interpretations
of this partition formula.

For this model, the average number of isolated regions
containing exactly $j$ particles, each with average density
parametrized by $b_1$, that are within spheres of size $V_0$
containing exactly $N$ particles is given by

$$\langle n_j,b_1|N,b_0\rangle = \sum_{\pi[\mathbf n]} n_j\; p(\mathbf n,b_1|N,b_0) = \frac{N}{j}\, f(j,b_1|N,b_0) = \mathcal N(j,b_1|N,b_0), \qquad \mathrm{(A31)}$$

where $\pi[\mathbf n]$ denotes the set of all distinct ordered partitions
of $N$ (Appendix B in Sheth 1996).

To set notation, it is useful to rewrite some of the
expressions derived earlier. Let $n_j(b_1,\mathbf n,N,b_0)$ denote the
number of $(j,b_1)$-haloes in the partition $\mathbf n$ of $N$. (In the
formula above, this was simply written as $n_j$.) Then

$$\mathcal N(j,b_1|N,b_0) = \sum_{\pi[\mathbf n]} n_j(b_1,\mathbf n,N,b_0)\; p(\mathbf n,b_1|N,b_0).$$

Define

$$\Delta_j(b_1,\mathbf n,N,b_0) = \frac{n_j(b_1,\mathbf n,N,b_0)}{\bar n(j,b_1)V_0} - 1, \qquad \mathrm{(A32)}$$

where $Nb_0 = \bar n V_0$. This is the overdensity of $(j,b_1)$-haloes in
the partition $\mathbf n$ of $N$, relative to the average density of such
haloes. This, averaged over all partitions, gives the average
bias relation of equation (A25):

$$\delta_h^L(j,b_1|N,b_0) = \sum_{\pi[\mathbf n]} \Delta_j(b_1,\mathbf n,N,b_0)\; p(\mathbf n,b_1|N,b_0). \qquad \mathrm{(A33)}$$

The variance in this bias relation is

$$\mathrm{Var}(\Delta_j) = \left\langle \Delta_j^2(b_1,\mathbf n,N,b_0)\right\rangle - \left\langle \Delta_j(b_1,\mathbf n,N,b_0)\right\rangle^2, \qquad \mathrm{(A34)}$$

where the average is over all partitions $\mathbf n$ of $N$. This is the
same as

$$\mathrm{Var}(\Delta_j) = \frac{\left\langle n_j^2(b_1,\mathbf n,N,b_0)\right\rangle}{[\bar n(j,b_1)V_0]^2} - \frac{\left\langle n_j(b_1,\mathbf n,N,b_0)\right\rangle^2}{[\bar n(j,b_1)V_0]^2},$$

where the averages are over all partitions $\mathbf n$ of $N$. The first
term is the second moment of the distribution of $(j,b_1)$-subhaloes
within $(N,b_0)$-haloes. The rms scatter around the
mean bias relation is the square root of $\mathrm{Var}(\Delta_j)$. So, to
compute the scatter in the bias relation requires knowledge
of the second moment of the halo distribution. Fortunately,
for the model described by equation (A30), all such higher
order moments are known.

The factorial moment of order $\alpha$, of the distribution of
$(j,b_1)$-haloes within $(N,b_0)$-haloes, is

$$\mu_\alpha(j,b_1|N,b_0) = \left\langle \frac{n_j!}{(n_j-\alpha)!},\,b_1\Big|N,b_0\right\rangle = [N(b_0-b_1)]^{\alpha}\;\frac{\eta^{\alpha}(j,b_1)\,\eta(m,B)}{\eta(N,b_0)}, \qquad \mathrm{(A35)}$$

where

$$mB = (N-\alpha j)B = Nb_0 - \alpha j b_1 \qquad \mathrm{(A36)}$$

(Appendix B of Sheth 1996). Similarly, cross-moments are
given by

$$\left\langle \frac{n_i!}{(n_i-\alpha)!}\,\frac{n_j!}{(n_j-\beta)!},\,b_1\Big|N,b_0\right\rangle = [N(b_0-b_1)]^{\alpha+\beta}\;\frac{\eta^{\alpha}(i,b_1)\,\eta^{\beta}(j,b_1)\,\eta(m,B)}{\eta(N,b_0)}, \qquad \mathrm{(A37)}$$

where

$$mB = (N-\alpha i-\beta j)B = Nb_0 - \alpha i b_1 - \beta j b_1. \qquad \mathrm{(A38)}$$

These formulae for the higher order moments were obtained
after using equation (A30) for the partition formula.
Sheth & Lemson (1998) show that this formula arises naturally
as a consequence of the fact that disconnected volumes
in a Poisson distribution are mutually independent. This
allows a simple interpretation of equation (A37).
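
For small $N$ the partition formula can be summed by brute force, which gives a concrete check on equations (A30), (A31) and (A35). The following Python sketch is illustrative only: all function names are hypothetical, partitions are enumerated as multisets of halo sizes (an assumption about how $\pi[\mathbf n]$ is to be read), and $\eta(m,B)$ is evaluated from the product $mB$ of equation (A36) so that the case $m = 0$ is handled as $e^{-mB}/mB$.

```python
import math

def eta_from_product(m, mB):
    """(mB)^(m-1) exp(-mB) / m!, i.e. eta(m, B) in terms of the product mB.

    Also used for m = 0, where it equals exp(-mB) / mB.
    """
    return math.exp((m - 1) * math.log(mB) - mB - math.lgamma(m + 1))

def partitions(N, largest=None):
    """Yield partitions of N as dicts {halo size j: multiplicity n_j}."""
    if largest is None:
        largest = N
    if N == 0:
        yield {}
        return
    for j in range(min(N, largest), 0, -1):
        for rest in partitions(N - j, j):
            out = dict(rest)
            out[j] = out.get(j, 0) + 1
            yield out

def partition_prob(n, N, b0, b1):
    """Equation (A30) for the occupation numbers n = {j: n_j}."""
    m = sum(n.values())
    Nb01 = N * (b0 - b1)
    log_p = ((m - 1) * math.log(Nb01) - Nb01
             - math.log(eta_from_product(N, N * b0)))
    for j, nj in n.items():
        log_p += nj * math.log(eta_from_product(j, j * b1)) - math.lgamma(nj + 1)
    return math.exp(log_p)

def factorial_moment(alpha, j, N, b0, b1):
    """Equation (A35), with mB = N b0 - alpha j b1 from (A36)."""
    m = N - alpha * j
    if m < 0:
        return 0.0
    mB = N * b0 - alpha * j * b1
    return ((N * (b0 - b1))**alpha * eta_from_product(j, j * b1)**alpha
            * eta_from_product(m, mB) / eta_from_product(N, N * b0))

if __name__ == "__main__":
    N, b0, b1 = 7, 0.9, 0.4
    parts = list(partitions(N))
    probs = [partition_prob(n, N, b0, b1) for n in parts]
    print(sum(probs))  # normalization of (A30)
    for j in range(1, 4):
        mean = sum(p * n.get(j, 0) for n, p in zip(parts, probs))
        second = sum(p * n.get(j, 0) * (n.get(j, 0) - 1)
                     for n, p in zip(parts, probs))
        print(j, mean, factorial_moment(1, j, N, b0, b1),
              second, factorial_moment(2, j, N, b0, b1))
```

Brute-force enumeration is only practical for small $N$, since the number of partitions grows rapidly; the closed forms (A35) and (A37) are what make larger cells tractable.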

Define

$$c(i,j,b_1|k,b_0) = \mathcal N(j,b_1|k,b_0)\;\mathcal N(i,b_1|k-j,b') = \langle n_j,b_1|k,b_0\rangle\,\langle n_i,b_1|k-j,b'\rangle, \qquad \mathrm{(A39)}$$

where

$$\frac{\bar n}{b'} = \frac{k-j}{{}_0V_k-{}_1V_j} = \bar n(1+\delta'). \qquad \mathrm{(A40)}$$

The halo containing $j$ particles can be thought of as occupying
${}_1V_j$ of the total volume ${}_0V_k$. Thus, $b'$ parametrizes the
density in the remaining volume ${}_0V_k - {}_1V_j$, which contains
$(k-j)$ particles. Thus, $c(ij|k)$ is the product of the mean
number of $(j,b_1)$-haloes within the volume associated with
the $(k,b_0)$-halo and the mean number of $(i,b_1)$-haloes in the
remaining volume, given that there is a $(j,b_1)$-halo within
${}_0V_k$. Now, equation (A31) implies that

$$c(i,j,b_1|k,b_0) = \frac{k}{j}\, f(j,b_1|k,b_0)\;\frac{k-j}{i}\, f(i,b_1|k-j,b') = k(b_0-b_1)\,\frac{\eta(j,b_1)\,\eta(k-j,b')}{\eta(k,b_0)} \times (k-j)(b'-b_1)\,\frac{\eta(i,b_1)\,\eta(k-j-i,b'')}{\eta(k-j,b')}, \qquad \mathrm{(A41)}$$

where $b''$ is defined similarly to $b'$. That is,

$$\frac{\bar n}{b''} = \frac{k-j-i}{{}_0V_k-{}_1V_j-{}_1V_i}. \qquad \mathrm{(A42)}$$

However,

$$(k-j)(b'-b_1) = (kb_0-jb_1)-(k-j)b_1 = k(b_0-b_1), \qquad \mathrm{(A43)}$$

so that

$$c(i,j,b_1|k,b_0) = [k(b_0-b_1)]^2\;\frac{\eta(j,b_1)\,\eta(i,b_1)\,\eta(k-j-i,b'')}{\eta(k,b_0)}.$$

This expression is symmetric in $i$ and $j$, and it is easy to see
that it is the same as

$$c(i,j,b_1|k,b_0) = \mathcal N(i,b_1|k,b_0)\;\mathcal N(j,b_1|k-i,b'), \qquad \mathrm{(A44)}$$

with the appropriate redefinition of $b'$. Simple algebra shows
that

$$c(i,j,b_1|k,b_0) = \langle n_i n_j,b_1|k,b_0\rangle, \qquad \mathrm{(A45)}$$

where the right hand side is equation (A37) with $\alpha = \beta = 1$.
This shows explicitly that

$$\langle n_i n_j,b_1|k,b_0\rangle = \langle n_j,b_1|k,b_0\rangle\,\langle n_i,b_1|k-j,b'\rangle, \qquad \mathrm{(A46)}$$

and that it was obtained by treating the volumes ${}_1V_j$ and
${}_0V_k - {}_1V_j$ as being disconnected from, and independent of,
each other.
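
The factorization in equation (A46) can be confirmed numerically from the closed forms. In the sketch below (hypothetical function names; each volume is described by the product of its particle number and its $b$ parameter, which equals $\bar n$ times the volume), the product of two mean counts from (A39) is compared with the cross-moment (A37) evaluated at $\alpha = \beta = 1$.

```python
import math

def eta_from_product(m, mB):
    """(mB)^(m-1) exp(-mB) / m!, i.e. eta(m, B) in terms of the product mB."""
    return math.exp((m - 1) * math.log(mB) - mB - math.lgamma(m + 1))

def mean_count(j, k, kb0, b1):
    """<n_j, b1 | k, b0> from (A35) with alpha = 1; kb0 = k * b0."""
    m = k - j
    if m < 0:
        return 0.0
    return ((kb0 - k * b1) * eta_from_product(j, j * b1)
            * eta_from_product(m, kb0 - j * b1) / eta_from_product(k, kb0))

def cross_moment(i, j, k, kb0, b1):
    """<n_i n_j, b1 | k, b0> from (A37) with alpha = beta = 1."""
    m = k - i - j
    if m < 0:
        return 0.0
    return ((kb0 - k * b1)**2
            * eta_from_product(i, i * b1) * eta_from_product(j, j * b1)
            * eta_from_product(m, kb0 - (i + j) * b1)
            / eta_from_product(k, kb0))

def c_product(i, j, k, kb0, b1):
    """Equation (A39): N(j|k,b0) N(i|k-j,b'), with (k-j) b' = k b0 - j b1."""
    return mean_count(j, k, kb0, b1) * mean_count(i, k - j, kb0 - j * b1, b1)

if __name__ == "__main__":
    k, b0, b1 = 9, 0.8, 0.3
    for i in range(1, 5):
        for j in range(1, 5):
            print(i, j, c_product(i, j, k, k * b0, b1),
                  cross_moment(i, j, k, k * b0, b1))
```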

This argument can be generalized to the higher order
moments. For example, if

$$(k-nj)\,b^{(n)} = kb_0 - njb_1, \quad \mathrm{with}\ b^{(0)} = b_0, \qquad \mathrm{(A47)}$$

then

$$(k-nj)\,(b^{(n)}-b_1) = k(b_0-b_1). \qquad \mathrm{(A48)}$$

So equation (A35), written for $N = k$, is also equal to

$$\mu_\alpha(j,b_1|k,b_0) = \prod_{n=0}^{\alpha-1} \mathcal N(j,b_1|k-nj,b^{(n)}). \qquad \mathrm{(A49)}$$

Thus, the higher order moments described by equation (A35)
are consistent with the fact that disconnected
volumes in a Poisson distribution are mutually independent.
The cross correlation moments of equation (A37) can be
interpreted similarly. Thus, for example, the variance in the
bias relation above is

$$\mathrm{Var}(\Delta_j) = \frac{c(j,j,b_1|N,b_0) + \mathcal N(j,b_1|N,b_0)}{[\bar n(j,b_1)V_0]^2} - \frac{\mathcal N(j,b_1|N,b_0)^2}{[\bar n(j,b_1)V_0]^2}. \qquad \mathrm{(A50)}$$

Equation (A31) in (A24) implies that

$$\bar n(j,b_1)V_0 = \sum_{N=j}^{N_{01}} \langle n_j,b_1|N,b_0\rangle\; Q(b_1,N,V_0). \qquad \mathrm{(A51)}$$


This shows how the average number of isolated regions, each
with average internal density $\bar n(1+\delta_1)$ and each containing
$j$ particles, that are within randomly placed volumes $V_0$, can
be obtained from the partition formula of equation (A30).
The main reason for writing this expression explicitly is that
it shows clearly how to compute the higher order moments
associated with the model of equation (A30).

Let $M_\alpha(j,b_1)$ denote the $\alpha$th factorial moment of the
distribution of $(j,b_1)$-regions that are within spheres of size
$V_0$. It is obtained by a similar average to that for the mean:

$$M_\alpha(j,b_1|V_0) = \sum_{N=\alpha j}^{N_{01}} \mu_\alpha(j,b_1|N,b_0)\; Q(b_1,N,V_0). \qquad \mathrm{(A52)}$$

When $\alpha = 1$, this is the same as equation (A51).

Let $\bar\xi_{hh}(ij|0)$ denote the correlation between isolated
$(i,b_1)$- and $(j,b_1)$-regions, averaged over Lagrangian cells of
size $V_0$. Then

$$M_2(j,b_1|V_0) = \left(\bar n(j,b_1)V_0\right)^2\left(1+\bar\xi_{hh}(jj|0)\right). \qquad \mathrm{(A53)}$$

Similarly, when the $b_1$-isolated regions do not have the same
number of particles,

$$1+\bar\xi_{hh}(ij|0) = \sum_{N=i+j}^{N_{01}} \frac{c(i,j,b_1|N,b_0)}{\bar n(i,b_1)V_0\;\bar n(j,b_1)V_0}\; Q(b_1,N,V_0), \qquad \mathrm{(A54)}$$

with the understanding that $c(ij|N) = 0$ if $(i+j) > N$, so
that $\bar\xi_{hh}(ij|0) = -1$ if $(i+j) > N_{01}$.

Suppose that each isolated $b_1$-region within $V_0$ is represented
by (a randomly chosen) one of its constituent particles.
This defines a point process, for which statistics such
as the distribution of halo counts-in-cells can be computed.
Since $(j,b_1)$-regions are associated with $(j,b_1)$-haloes, it is
convenient to call the randomly chosen representative point
of such a halo its centre-of-mass. The expressions above give
the higher order moments of the distribution of counts of
haloes in randomly placed cells $V_0$. Halo-halo correlations
can be computed from these moments. For example,
equation (A54) gives the volume averaged correlation function
of $(i,b_1)$- and $(j,b_1)$-haloes. All the necessary sums can be
evaluated analytically.


A6 Statistics for a range of halo masses 

The previous subsections considered the halo distribution
when the halo mass was specified. This subsection shows
how to compute correlations between haloes that have a
range of masses. This is necessary, since comparison with
simulations is typically done by considering averages over a
range of masses, and, as we discuss below, the transition to
considering a range of masses is not completely straightforward.
That is, simply integrating the previous expressions
over the relevant mass range, weighted by the unconditional
mass function, is not entirely correct. It turns out that, over
a large range of scales, the correct expression yields only a
minor correction to the naive expression, so readers interested
only in results may prefer to skip this section.

So far, the distribution of isolated regions and that of
the centre-of-mass distribution of collapsed haloes were assumed
to be the same. However, there is an important difference
between haloes and isolated regions. Namely, by definition,
a Lagrangian volume $V_0$ with overdensity $\delta_0$ cannot
contain an isolated $V_1 < V_0$ region of overdensity $\delta_1 < \delta_0$,
nor can it contain an isolated region of density $\delta_1 < \delta_0$ if its
size is $V_1 > V_0$. Thus, the number of such isolated regions
within an overdense or non-isolated cell $V_0$ is zero.

However, since a collapsed halo is represented only
by the volume element associated with its centre-of-mass,
haloes are said to lie within a cell if their centre-of-mass
does. Thus, an $M_1$ halo may well lie within a $V_0$ cell, even
if $M_1 > M_0$. Moreover, in the model, a region $V_0$ of density
$\delta_0 > \delta_1$ is certainly a subregion of an isolated $\delta_1$ region, with
$V_1 > V_0$. Such an overdense $V_0$ is said to contain the $M_1$ halo
only if the volume element that represents the centre-of-mass
of the halo falls inside it; in the model, the centre-of-mass
is a randomly chosen volume element, so this happens with
probability $(V_0/V_1)$. Thus, a cell $V_0$ that is either overdense
or not isolated may contain a halo, whereas, by definition,
it cannot contain an isolated region. Previously, this difference
between haloes and isolated regions was unimportant.
Now, however, since we must integrate over a range of halo
masses, it can be important.

Consider the set of $V_0$ cells placed randomly in the Lagrangian
space. Suppose we wish to count up the number
of $b_1$-haloes that are more massive than $m$, that are in such
cells. Given a value of $b_1$, these cells can be divided into
two classes: those that are isolated and those that are not.
Those that are isolated can be classified by the number $N$
of particles within the cell. All isolated cells that contain $N$
particles can be further classified by the way in which the
$N$ particles are divided into $b_1$-haloes. Consider an isolated
cell $V_0$ that is known to contain exactly $N$ particles which
are partitioned into $b_1$-haloes. As before, denote the particular
partition by the vector $\mathbf n$. Let $N_h(j,b_1|\mathbf n,b_0)$ denote
the number of $(j,b_1)$-haloes that are within such a cell. The
number of $b_1$-haloes more massive than $m$ that are within
such cells is

$$N_{\rm isol}(>m,b_1|\mathbf n,V_0) = \sum_{j>m}^{N} N_h(j,b_1|\mathbf n,b_0). \qquad \mathrm{(A55)}$$

Equation (A31) shows that this quantity, averaged over all
partitions of $N$, is

$$\sum_{\pi[\mathbf n]} N_{\rm isol}(>m,b_1|\mathbf n,V_0)\; p(\mathbf n,b_1|N,b_0) = \sum_{j>m}^{N} \langle n_j,b_1|N,b_0\rangle.$$

This, averaged over all values of $N$, is

$$N_{\rm isol}(>m,b_1|V_0) = \sum_{N} Q(b_1,N,V_0) \sum_{j>m}^{N} \langle n_j,b_1|N,b_0\rangle, \qquad \mathrm{(A56)}$$

since $Q(b_1,N,V_0)$ denotes the fraction of the total number
of cells that are isolated. This sum is zero when $N < m$,
because if the cell $V_0$ is isolated, then all the particles associated
with a halo within $V_0$ must be contained in $V_0$, and
we are only counting haloes more massive than $m$. The definition
of $Q$ (equation A20) ensures that the sum is also zero
when $N > N_{01}$. This is because, when $N > N_{01}$, then the
cell is denser than $b_1$, so it is not isolated on the scale $V_0$.
The order of the sums above can be interchanged to yield

$$N_{\rm isol}(>m,b_1|V_0) = \sum_{j>m}^{N_{01}}\sum_{N=j}^{N_{01}} Q(b_1,N,V_0)\;\mathcal N(j,b_1|N,b_0) = \sum_{j>m}^{N_{01}} \bar n(j,b_1)V_0, \qquad \mathrm{(A57)}$$

where the final equality follows from equation (A24).

Cells that are not isolated on scale $V_0$ can be classified
by the scale ${}_1V_j > V_0$ at which they first become isolated.
They can be further classified by the number of particles
$N \le j$ they actually contain on scale $V_0$. The probability
that a cell first becomes isolated on scale ${}_1V_j$, given that it
contains $N$ particles on scale $V_0 < {}_1V_j$, is

$$P(j,b_1|N,V_0) = \frac{p(N,V_0|j,{}_1V_j)\; F(j,b_1)}{p(N,V_0)}. \qquad \mathrm{(A58)}$$

Recall that $F(j,b_1)$ is the probability that a randomly placed
cell is isolated on the scale ${}_1V_j = jb_1/\bar n$, so the expression
above follows from Bayes' rule. The region $V_0$ is a subregion
within the isolated region ${}_1V_j$. Since ${}_1V_j$ is isolated, it can be
thought of as a $(j,b_1)$-halo. The subregion $V_0$ is said to contain
this $(j,b_1)$-halo only if it contains the randomly chosen
centre-of-mass particle of the halo. This happens with probability
$V_0/{}_1V_j$. Therefore, the average number of $b_1$-haloes
that are in cells which are not isolated on scale $V_0$ is

$$N_{\rm other}(>m,b_1|V_0) = \sum_{N=0}^{\infty} p(N,V_0) \sum_{j=j_{\min}}^{\infty} \frac{V_0}{{}_1V_j}\; P(j,b_1|N,V_0), \qquad \mathrm{(A59)}$$

where $j_{\min} = (m+1)$ if $m > N_{01}$. Otherwise, $j_{\min} = (N_{01}+1)$.
Since $V_0 < {}_1V_j$, $p(N,V_0|j,{}_1V_j) = 0$ if $N > j$. With this in
mind, the order of the sums can be interchanged:

$$N_{\rm other} = \sum_{j=j_{\min}}^{\infty} (V_0/{}_1V_j)\; F(j,b_1) \sum_{N=0}^{j} p(N,V_0|j,{}_1V_j).$$

The sum over $N$ is unity, so the average number of $b_1$-haloes
more massive than $m$ that are within such $V_0$ cells is

$$N_{\rm other} = \sum_{j=j_{\min}}^{\infty} \frac{V_0}{{}_1V_j}\; F(j,b_1) = \sum_{j=j_{\min}}^{\infty} \bar n(j,b_1)V_0. \qquad \mathrm{(A60)}$$

The final equality follows from equation (A9).

On average, the number of $b_1$-haloes that are more massive
than $m$, that are within randomly placed $V_0$ cells, is
given by adding the contribution from the two types of
cells: those that are isolated on scale $V_0$ and those that
are not. Thus, when $m < N_{01}$, then the average over all $V_0$
cells, $N_{\rm isol} + N_{\rm other}$, is

$$\bar n(>m,b_1)V_0 = \sum_{j>m} \bar n(j,b_1)V_0. \qquad \mathrm{(A61)}$$

If $m > N_{01}$, then $N_{\rm isol}(>m,b_1|V_0) = 0$, and the average is
simply $N_{\rm other}(>m,b_1|V_0)$, which is the same as the expression
above.

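As before, define

$$\Delta_h(>m,b_1|\mathbf n,V_0) = \frac{N_h(>m,b_1|\mathbf n,V_0)}{\bar n(>m,b_1)V_0} - 1. \qquad \mathrm{(A62)}$$

The cross correlation between haloes and mass, averaged
over all cells $V_0$, is

$$\bar\xi_{hm}(>m,b_1|V_0) = \left\langle \Delta_h(>m,b_1|\mathbf n,V_0)\,\delta_0\right\rangle = \frac{\left\langle N_h(>m,b_1|\mathbf n,V_0)\,\delta_0\right\rangle}{\bar n(>m,b_1)V_0}, \qquad \mathrm{(A63)}$$

since $\delta_0 = (N-\bar N_0)/\bar N_0$.

For isolated cells, this average can be computed in two
steps. The first is to average over all partitions $\pi[\mathbf n]$ of $N$.
The second is to average over all values of $N$. If $m < N_{01}$,
then the contribution from isolated cells is

$$\langle N_{\rm isol}\,\delta_0\rangle = \sum_{j>m}^{N_{01}}\sum_{N=j}^{N_{01}} \delta_0\; Q(b_1,N,V_0)\;\mathcal N(j,b_1|N,b_0) = \sum_{j>m}^{N_{01}} \bar n(j,b_1)V_0\;\bar\xi_{hm}(j,b_1|V_0), \qquad \mathrm{(A64)}$$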

where the first equality arises from the average over partitions
of $N$ (equation A57), and the second equality follows
from using equation (A29). The contribution from the other
cells is


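$$\langle N_{\rm other}\,\delta_0\rangle = \sum_{j>N_{01}} \frac{V_0}{{}_1V_j}\; F(j,b_1) \sum_{N=0}^{j} \delta_0\; p(N,V_0|j,{}_1V_j). \qquad \mathrm{(A65)}$$

Since $\delta_0 = (N-\bar N_0)/\bar N_0$, the sum over $N$ is

$$\frac{j}{\bar N_0}\,\frac{V_0}{{}_1V_j} - 1 = \frac{1}{b_1} - 1 = \delta_1,$$

so the contribution from these other cells is

$$\langle N_{\rm other}\,\delta_0\rangle = \delta_1 \sum_{j>N_{01}} \bar n(j,b_1)V_0. \qquad \mathrm{(A66)}$$

The cross correlation function averaged over all cells is the
sum of these two terms divided by $\bar n(>m,b_1)V_0$:

$$\bar\xi_{hm}(>m,b_1|V_0) = \sum_{j>m}^{N_{01}} \frac{\bar n(j,b_1)V_0\;\bar\xi_{hm}(j,b_1|V_0)}{\bar n(>m,b_1)V_0} + \delta_1 \sum_{j>N_{01}} \frac{\bar n(j,b_1)V_0}{\bar n(>m,b_1)V_0}. \qquad \mathrm{(A67)}$$

There is no contribution from isolated cells, and the remaining
cells yield

$$\bar\xi_{hm}(>m,b_1|V_0) = \delta_1, \quad \mathrm{if}\ m > N_{01}.$$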
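
The step from equation (A65) to (A66) uses only the mean of a binomial distribution, and is easy to confirm numerically. In this minimal Python sketch the function name `mean_delta0` is hypothetical, $x$ denotes $\bar N_0 = \bar n V_0$, and $p(N,V_0|j,{}_1V_j)$ is assumed to be binomial with success probability $V_0/{}_1V_j = x/(jb_1)$.

```python
import math

def mean_delta0(j, b1, x):
    """Average of delta_0 = N / x - 1 over N ~ Binomial(j, V0 / 1V_j).

    Here x = nbar * V0 and 1V_j = j b1 / nbar, so the success probability
    is q = x / (j b1); this requires x <= j b1 (that is, V0 <= 1V_j).
    """
    q = x / (j * b1)
    total = 0.0
    for N in range(j + 1):
        pmf = math.comb(j, N) * q**N * (1.0 - q)**(j - N)
        total += (N / x - 1.0) * pmf
    return total

if __name__ == "__main__":
    b1, x = 0.4, 2.0
    delta1 = 1.0 / b1 - 1.0
    for j in (6, 8, 20):
        print(j, mean_delta0(j, b1, x), delta1)
```

The result is independent of both $j$ and $x$, which is why $\delta_1$ factors out of the sum over $j$ in equation (A66).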


Auto-correlations between haloes can be computed sim¬ 
ilarly. Define 

^ih(>m,6i|I/o) = {Al{>m,bi\n,Vo ))- \ .(A68) 

' ' n[>m,bi)Vo 

The second term is the shot-noise term. It accounts for the 
fact that the halo distribution is discrete. 


First consider the case when $m < N_{01}$, so isolated cells may contain more than one halo in the mass range of interest. For isolated cells, correlations arise as a result of two averages. The first is over all partitions of $N$. The second is over all values of $N$. Given a partition $\pi$ of $N$,

$$\mathcal{N}_h^2 = (n_{m+1} + \cdots + n_N)^2 = \sum_{i>m}^{N}\sum_{j>m}^{N} n_i n_j.$$

Equations (A35) and (A37) show how to compute these averages over the set of partitions $\pi[n]$. Notice that when $i = j$, then (A37) for $\langle n_i n_j\rangle$ is the same as (A35) for $\langle n_i(n_i - 1)\rangle$. Therefore, if we use (A37) even when $i = j$, and write it using (A45), then the average over $N$ is


$$\langle\mathcal{N}^2_{\rm isol}\rangle = \sum_{N} q(b_1,N,V_0)\sum_{i>m}^{N_{01}}\sum_{j>m}^{N_{01}} c(i,j,b_1|N,b_0) + \sum_{N} q(b_1,N,V_0)\sum_{j>m}^{N_{01}} \mathcal{N}(j,b_1|N,b_0), \tag{A69}$$

where $c(i,j|N) = \langle n_i n_j|N\rangle = 0$ if $(i+j) > N$. Equation (A57) shows that the second term is just $\mathcal{N}_{\rm isol}(>m,b_1|V_0)$.

Cells $V_0$ that are not isolated contain either one halo or none. So, the contribution from these cells is just $\mathcal{N}_{\rm other}(>m,b_1|V_0)$ of equation (A59). The contribution from these cells, plus the second term from the isolated cells, equals $\bar n(>m,b_1)V_0$. Together, they cancel the shot noise term in the definition of $\bar\xi_{hh}$. The order of the sums in the remaining first term of (A69) can be rearranged to yield


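$$1 + \bar\xi_{hh}(>m,b_1|V_0) = \sum_{i>m}^{N_{01}}\sum_{j>m}^{N_{01}} \frac{\bar n(i,b_1)V_0\,\bar n(j,b_1)V_0}{[\bar n(>m,b_1)V_0]^2} \times \sum_{N} \frac{c(i,j,b_1|N,b_0)}{\bar n(i,b_1)V_0\,\bar n(j,b_1)V_0}\,q(b_1,N,V_0), \tag{A70}$$

where $c(i,j|N) = 0$ if $(i+j) > N$. If $m > N_{01}$, there are no isolated cells which contain haloes in the mass range of interest. All other cells contain either one halo or none, so, for these cells $\langle\mathcal{N}_h^2(>m,b_1|V_0)\rangle = \bar n(>m,b_1)V_0$. This term cancels the shot noise term, so that

$$\bar\xi_{hh}(>m,b_1|V_0) = -1, \quad {\rm if}\ m > N_{01}. \tag{A71}$$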


Comparison with equation (A54) shows that

$$1 + \bar\xi_{hh}(>m,b_1|V_0) = \sum_{i>m}^{N_{01}}\sum_{j>m}^{N_{01}} \frac{\bar n(i,b_1)V_0\,\bar n(j,b_1)V_0}{\bar n^2(>m,b_1)V_0^2}\,\left[1 + \bar\xi_{hh}(i,j,b_1|V_0)\right], \tag{A72}$$

with the convention that $1 + \bar\xi_{hh}(i,j|V_0) = 0$ if $(i+j) > N_{01}$.


A7 Clustering from white noise as a limit of the 
Poisson model 

This subsection shows explicitly that, in the limit of small fluctuations and large numbers of particles, all the statements about clustering from white noise initial conditions presented in the main text can be derived from the Poisson statements derived above by using Stirling's approximation for all the factorials, and writing all expressions to lowest order in $\delta$.

Let $N = \bar nV(1+\delta)$, and let $S = \langle\delta^2\rangle = 1/\bar nV$ denote the mean square fluctuation of $\delta = (N - \bar nV)/\bar nV$ in cells of size $V$. Then $dN/d\delta = \bar nV$, and, when $\delta \ll 1$, use of Stirling's approximation for the factorial reduces equation (…) for $p(M,V)$ to equation (…). Similarly, equation (A2) tends to equation (…). Furthermore, $f(M,b) \to f(S,\delta)\,|dS/dM|\,dM$ of equation (…) (e.g. Epstein 1983) and $f(M_1,b_1|M_0,b_0) \to f(S_1,\delta_1|S_0,\delta_0)\,|dS_1/dM_1|\,dM_1$ of equation (…) (Sheth 1995), since $b = 1/(1+\delta) \approx (1-\delta)$, and $1 - (b_1/b_2) \approx (\delta_1 - \delta_2)$. Simple algebra shows that equations (…) and (…) satisfy a recursion relation that is similar to the one in the discrete Poisson case (and solved in Appendix B). Namely,

$$p(S_1,\delta_1|S_0,\delta_0) = \int_{S_0}^{S_1} f(S,\delta_1|S_0,\delta_0)\,p(S_1,\delta_1|S,\delta_1)\,dS. \tag{A73}$$

By considering the statistics of trajectories that are analogous to those considered in the Poisson case, Bond et al. (1991) have shown that these expressions can be derived directly from the white noise field itself.
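The Stirling limit described at the start of this subsection can be illustrated numerically. The sketch below assumes that the target of the limit is a Gaussian in $\delta$ with variance $S = 1/\bar nV$ (the white noise form; the corresponding equation number is not legible here), and compares it with the Poisson probability multiplied by $dN/d\delta = \bar nV$. The helper names and the chosen numbers are hypothetical.

```python
from math import exp, lgamma, log, pi, sqrt

def poisson_pmf(N, mean):
    """Poisson probability of N counts, evaluated in logs to avoid overflow."""
    return exp(N * log(mean) - mean - lgamma(N + 1))

def gaussian_in_delta(delta, S):
    """Assumed white noise limit: Gaussian in delta with variance S."""
    return exp(-delta**2 / (2.0 * S)) / sqrt(2.0 * pi * S)

nbarV = 1.0e4                    # mean count in the cell (large, as the limit requires)
S = 1.0 / nbarV                  # mean square fluctuation of delta
N = 10100                        # a count close to the mean, so delta << 1
delta = (N - nbarV) / nbarV

poisson_density = poisson_pmf(N, nbarV) * nbarV   # p(N) dN/d(delta)
gauss_density = gaussian_in_delta(delta, S)
ratio = poisson_density / gauss_density
print(delta, ratio)
```

The ratio approaches unity as $\bar nV$ grows at fixed $\delta/\sqrt{S}$; the residual difference at finite $\bar nV$ reflects the skewness of the Poisson distribution.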

The virtue of using the trajectory description is that it allows one to see the correctness of many statements that are otherwise tedious to compute. For example, suppose we label each trajectory by the value $S'$, which is the smallest value of $S$ at which it has overdensity $\delta'$. If $\delta > \delta' > 0$, then

$$p(S,\delta) = \int_0^S p(S,\delta|S',\delta')\,f(S',\delta')\,dS'. \tag{A74}$$

The left hand side of this expression is the set of all trajectories that pass through $\delta$ at $S$. The right hand side is the set of all trajectories that first pass through $\delta' < \delta$ at $S' < S$, and then pass through $\delta$ at $S$, summed over all $S' < S$, since trajectories that first pass through $\delta'$ on scale $S' > S$ have certainly not passed through $\delta > \delta'$ at $S$. Clearly, the left hand side equals the right. When $\delta = \delta'$, then direct substitution shows that this is correct. Otherwise, direct substitution is not the easiest way to see that this must be correct. This equation is the analogue of equation (A8).
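Equation (A74) can also be checked by brute-force quadrature. The sketch below assumes the standard white noise forms, which are not restated in this appendix: $p(S,\delta)$ is a Gaussian of variance $S$, $p(S,\delta|S',\delta')$ is a Gaussian in $\delta-\delta'$ of variance $S-S'$, and $f(S,\delta) = (\delta/S)\,p(S,\delta)$ is the first-crossing distribution. The function names and the midpoint-rule integrator are illustrative choices, not part of the original derivation.

```python
from math import exp, pi, sqrt

def p(S, delta):
    """Assumed Gaussian density of delta on scale S."""
    return exp(-delta * delta / (2.0 * S)) / sqrt(2.0 * pi * S)

def f(S, delta):
    """Assumed first-crossing distribution of the barrier delta."""
    return (delta / S) * p(S, delta)

def rhs_a74(S, delta, delta_prime, steps=20000):
    """Midpoint-rule estimate of the integral over S' in equation (A74)."""
    h = S / steps
    total = 0.0
    for i in range(steps):
        S_prime = (i + 0.5) * h
        # conditional density depends only on the increments (white noise)
        total += p(S - S_prime, delta - delta_prime) * f(S_prime, delta_prime)
    return total * h

S, delta, delta_prime = 1.0, 1.0, 0.5
lhs = p(S, delta)
rhs = rhs_a74(S, delta, delta_prime)
print(lhs, rhs)
```

Under these assumptions the two sides agree for any $\delta > \delta' > 0$, which is the statement that every trajectory reaching $\delta$ at $S$ must first have crossed $\delta'$ on some smaller scale.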

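Notice that

$$-\frac{\partial p(S,\delta)}{\partial\delta} = f(S,\delta), \tag{A75}$$

and

$$-\frac{\partial p(S,\delta|S',\delta')}{\partial\delta} = f(S,\delta|S',\delta'). \tag{A76}$$

These relations, with equation (A74), imply that

$$-\frac{\partial p(S,\delta)}{\partial\delta} = -\int_0^S \frac{\partial p(S,\delta|S',\delta')}{\partial\delta}\,f(S',\delta')\,dS' = \int_0^S f(S,\delta|S',\delta')\,f(S',\delta')\,dS' = f(S,\delta), \tag{A77}$$

as required by equation (A75).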
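Both ingredients of equation (A77) can be verified numerically: the derivative relation by a centred finite difference, and the final equality by quadrature. As a hedged sketch, the code assumes Gaussian white noise forms (variance $S$, independent increments) and $f(S,\delta) = (\delta/S)\,p(S,\delta)$; none of the function names come from the source.

```python
from math import exp, pi, sqrt

def p(S, delta):
    """Assumed Gaussian density of delta on scale S."""
    return exp(-delta * delta / (2.0 * S)) / sqrt(2.0 * pi * S)

def f(S, delta):
    """Assumed first-crossing distribution of the barrier delta."""
    return (delta / S) * p(S, delta)

def minus_dp_ddelta(S, delta, eps=1.0e-5):
    """Centred finite-difference estimate of -dp/d(delta)."""
    return -(p(S, delta + eps) - p(S, delta - eps)) / (2.0 * eps)

def two_step_crossing(S, delta, delta_prime, steps=20000):
    """Midpoint-rule estimate of the integral of f(S,delta|S',delta') f(S',delta') over S'."""
    h = S / steps
    total = 0.0
    for i in range(steps):
        S_prime = (i + 0.5) * h
        # conditional first crossing: same form in the shifted variables
        total += f(S - S_prime, delta - delta_prime) * f(S_prime, delta_prime)
    return total * h

S, delta, delta_prime = 1.0, 1.0, 0.5
derivative = minus_dp_ddelta(S, delta)
convolution = two_step_crossing(S, delta, delta_prime)
direct = f(S, delta)
print(derivative, convolution, direct)
```

All three numbers should coincide to within the discretisation error of the two estimates.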

The number density of $M_1$ haloes, that is, the unconstrained mass function, is $\bar\rho f(M_1,\delta_1)/M_1$ which is the same as equation (…). Similarly, the conditional mass distribution is $(M_0/M_1)\,f(M_1,\delta_1|M_0,\delta_0)$ which is the same as equation (…). These are the analogues of equations (A9) and (A16).


The limit of equation (A20) is

$$q(b_1,M_0,V_0) \to q(\delta_1,\delta_0,V_0) = p(S_0,\delta_0) - \int_0^{S_0} p(S_0,\delta_0|S_1,\delta_1)\,f(S_1,\delta_1)\,dS_1. \tag{A78}$$

If $\delta_0 > \delta_1$, then equation (A74) shows that $q = 0$. When $\delta_0 < \delta_1$, then the integral above can be solved to yield equation (…). Bond et al. (1991) discuss Chandrasekhar's derivation of $q(\delta_1,\delta_0,V_0)$. Their discussion of excursion set trajectories associated with Gaussian random fields shows, with no calculation, that the expression above is correct.
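The two regimes of equation (A78) can be illustrated with a short numerical sketch. It assumes Gaussian white noise forms for $p$ and $f$, and it compares the integral for $\delta_0 < \delta_1$ with the familiar reflection (image) solution for a random walk with an absorbing barrier, $[\exp(-\delta_0^2/2S_0) - \exp(-(2\delta_1-\delta_0)^2/2S_0)]/\sqrt{2\pi S_0}$. That closed form is an assumption about what the referenced equation contains, since its number is not legible here; the function names are hypothetical.

```python
from math import exp, pi, sqrt

def p(S, delta):
    """Assumed Gaussian density of delta on scale S."""
    return exp(-delta * delta / (2.0 * S)) / sqrt(2.0 * pi * S)

def f(S, delta):
    """Assumed first-crossing distribution of the barrier delta."""
    return (delta / S) * p(S, delta)

def q_integral(delta1, delta0, S0, steps=20000):
    """Right hand side of (A78), with the integral done by the midpoint rule."""
    h = S0 / steps
    integral = 0.0
    for i in range(steps):
        S1 = (i + 0.5) * h
        integral += p(S0 - S1, delta0 - delta1) * f(S1, delta1)
    return p(S0, delta0) - integral * h

def q_reflection(delta1, delta0, S0):
    """Assumed closed form for delta0 < delta1 (absorbing-barrier image solution)."""
    direct = exp(-delta0**2 / (2.0 * S0))
    image = exp(-(2.0 * delta1 - delta0)**2 / (2.0 * S0))
    return (direct - image) / sqrt(2.0 * pi * S0)

below = q_integral(delta1=1.0, delta0=0.3, S0=1.0)      # delta0 < delta1
closed = q_reflection(delta1=1.0, delta0=0.3, S0=1.0)
above = q_integral(delta1=1.0, delta0=1.4, S0=1.0)      # delta0 > delta1
print(below, closed, above)
```

For $\delta_0 > \delta_1$ the result vanishes to within the quadrature error, as equation (A74) requires.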

The excursion set approach of Bond et al. (1991) also shows why equation (…) must be correct. Consider the set of all excursion set trajectories, and label each trajectory by its value of $\delta(V_0) = \delta_0$ on scale $V_0$. Now, $q(\delta_1,\delta_0,V_0)$ gives the probability that such a trajectory lies below $\delta_1$ for all $V > V_0$, and $f(M_1,\delta_1|M_0,\delta_0)$ of equation (…) gives the fraction of trajectories that first cross the value $\delta_1$ on scale $V_1$, given that they have value $\delta_0$ on scale $V_0$. Integrating the product of these two expressions over all $\delta_0 < \delta_1$ gives the fraction of trajectories that first cross the value $\delta_1$ on the scale $V_1$, which is the same as equation (…). The extra factor of $M_0/M_1$ on the left hand side above is $\bar\rho V_0/M_1$ when written on the right hand side, which is consistent with equation (…).

Expressions for the mean bias between haloes and mass can be obtained by taking similar limits. A little algebra shows that the peak background split of equation (…) could have been obtained directly from the corresponding Poisson limit, equation (A26).

Expressions for the cross correlation between haloes and mass, as well as for the higher order moments of the halo distribution, all transform similarly. For example, equation (…) could have been derived by taking the limit of equation (A49), etc.


APPENDIX B: SOLUTION TO THE 
RECURSION RELATION 

This Appendix shows, by direct substitution, that equation (A13) for $F(j|k)$ in the main text solves the recursion relation given in equation (A11).

Equation (A11) can be rearranged to read


F{m,bi\k,b 2 )p{j, iVj\m, iVm) 


m>j 


p{j, iVi,k, 2Vk)r{S2\k,Vk) 
F{k,b2) 


- F{j,bi\k,b2). (Bl) 


Equation (A7) shows that the right hand side of this expres¬ 
sion is 

RHS = ~ ~ F{j,bi\k,h2), (B2) 

P{k, 2I4) 

where all the $p(n,V)$s are Poisson, so they are given by equation (A1). If equation (A13) for $F(j|k)$ is correct, then


RHS.(t-,)|M|i(h)"‘(l-,^)' , ,B3) 


k-j-l 
Substituting equation (A12) for $p(j,{}_1V_j|m,{}_1V_m)$ in the left hand side gives

k — m — l 


( 


fc 1 - 


k\ 


k — m 


(ir( 


hi' 


&2 / 
m — j 


(B4) 


summed over all $j < m < k$. This reduces to


(!)(£- 

n N / 




N-n-1 

(B5) 


where $N = (k - j - 1)$. Abel's generalization of the binomial theorem

$$(x+y)^N = \sum_{m=0}^{N}\binom{N}{m}\,x\,(x - mz)^{m-1}\,(y + mz)^{N-m}, \tag{B6}$$

with $m = N - n$, $x = k(b_2 - b_1)/b_2$, $y = (k - j)(b_1/b_2)$, and $z = -(b_1/b_2)$, reduces this to equation (B3).
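Abel's identity (B6) is easy to confirm with exact rational arithmetic. The sketch below evaluates the sum for the substitution quoted above, using hypothetical values of $k$, $j$, $b_1$ and $b_2$ chosen only for illustration; `abel_sum` is not a name from the source.

```python
from fractions import Fraction
from math import comb

def abel_sum(x, y, z, N):
    """Right hand side of Abel's identity (B6), evaluated exactly."""
    return sum(
        comb(N, m) * x * (x - m * z) ** (m - 1) * (y + m * z) ** (N - m)
        for m in range(N + 1)
    )

# Illustrative values only; any b2 > b1 > 0 and k > j + 1 would do.
k, j = 9, 3
b1, b2 = Fraction(1, 3), Fraction(1, 2)

N = k - j - 1
x = k * (b2 - b1) / b2
y = (k - j) * (b1 / b2)
z = -(b1 / b2)

left = (x + y) ** N
right = abel_sum(x, y, z, N)
print(left, right)
```

With this substitution $x + y = k - j\,b_1/b_2$, so the identity collapses the sum over $m$ into a single power, which is what the reduction to equation (B3) relies on.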

A similar recursion relation is satisfied by $f(j,b_1|k,b_2)$. Namely,


yG f{m,bi\k,b 2 )pU, iVj\m, iVn, 


m>j 


pU, iVj,k, 2 Vk)ri 62 \k,Vk) 


- f{j,bi\k,b2). (B7) 


fik,b2) 

Since now trajectories are known to be centred on particles,

$$p(j,{}_1V_j,k,{}_2V_k) = p(j-1,{}_1V_j)\,p(k-j,{}_2V_k - {}_1V_j) = \frac{p(j,{}_1V_j)}{b_1}\,p(k-j,{}_2V_k - {}_1V_j). \tag{B8}$$

Since $F(k,b_2) = b_2\,f(k,b_2)$ (equation A10), the right hand side of equation (B7) is $(b_2/b_1)$ times that in equation (B1).

Similarly, since now trajectories are centred on particles,

$$p(j,{}_1V_j|m,{}_1V_m) = \frac{p(j-1,{}_1V_j)\,p(m-j,{}_1V_m - {}_1V_j)}{p(m-1,{}_1V_m)}. \tag{B9}$$

This is the same as equation (A12). Therefore, if the left hand side of equation (B7) is to equal $(b_2/b_1)$ times the left hand side of equation (B1), then it must be that $f(j,b_1|k,b_2) = (b_2/b_1)\,F(j,b_1|k,b_2)$. This is just what is required by equation (A14). Thus, if $f(j|k)$ is given by equation (A14), then it satisfies the recursion relation (B7).
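The factor of $1/b_1$ in equation (B8) follows from the Poisson form of $p(n,V)$ once the volume ${}_1V_j$ is tied to $j$. As a minimal sketch, assume ${}_1V_j = j\,b_1/\bar n$ (from $N = \bar nV(1+\delta)$ with $b = 1/(1+\delta)$); then $p(j-1,{}_1V_j) = p(j,{}_1V_j)/b_1$. The same code shows that the ratio in equation (B9) is a binomial in the volume fraction, which is a general property of Poisson counts rather than a statement taken from the source. All names and numbers are hypothetical.

```python
from math import comb, exp, lgamma, log

def poisson(n, mean):
    """Poisson probability of n counts with the given mean."""
    return exp(n * log(mean) - mean - lgamma(n + 1))

nbar, b1 = 1.5, 0.25          # illustrative density and threshold
j, m = 7, 12                  # illustrative particle numbers, j < m
V_j = j * b1 / nbar           # assumed volume associated with j particles
V_m = m * b1 / nbar           # assumed volume associated with m particles

# Equation (B8): moving from j - 1 to j particles costs a factor 1/b1.
b8_left = poisson(j - 1, nbar * V_j)
b8_right = poisson(j, nbar * V_j) / b1

# Equation (B9) evaluated with Poisson factors, compared with a binomial.
b9 = (poisson(j - 1, nbar * V_j) * poisson(m - j, nbar * (V_m - V_j))
      / poisson(m - 1, nbar * V_m))
frac = V_j / V_m
binomial = comb(m - 1, j - 1) * frac ** (j - 1) * (1.0 - frac) ** (m - j)
print(b8_left, b8_right, b9, binomial)
```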


APPENDIX C: AVERAGING OVER ALL 
VOLUMES 

This Appendix shows that the expressions for the conditional and unconditional mass functions are obtained by an averaging process envisaged by Bower (1991). Namely, the averaging is over all possible subvolumes, not necessarily connected, that are contained entirely within a parent volume.

Suppose space is divided up into a large number $C$ of infinitesimally small cells, each of volume $v$. The cells are sufficiently small that each cell is either empty, or it contains one and only one particle. Suppose that there are $N$ particles distributed in this space. This means that $N$ of the $C$ cells are occupied. Now choose $c$ cells in random order without replacement from the total set of $C$ cells. The probability that $n$ of these $c$ cells are occupied is

$$p(n,c) = \binom{c}{n}\,\frac{N(N-1)\cdots(N-n+1)}{C(C-1)\cdots(C-n+1)}\,\frac{(C-N)\cdots(C-N-(c-n)+1)}{(C-n)\cdots(C-c+1)}. \tag{C1}$$

When $C \gg c \gg N \gg n$, Stirling's approximation for all the factorials except $n!$ reduces this to

$$p(n,c) \to \frac{(cN/C)^n}{n!}\,\exp\left(-\frac{cN}{C}\right).$$

Now, $Cv$ is the total volume, so $(N/Cv)$ is the average number density of particles; denote it by $\bar n$. The parameter $cv$ is the size of the cell made of $c$ infinitesimal cells; set $cv = V$. Then $(cN/C) = \bar nV$ and this expression is the same as equation (A1). This shows explicitly how the Poisson distribution is obtained by choosing, in random order without replacement, a series of volume elements of the total space, and weighting each series of choices with the probability that it occurs. Since $F(j,b)$ is simply the product of $p(j,V_j)$ with a quantity that depends on $b$ but not $V$, the argument above applies to $F(j,b)$ also. In particular, since the volume elements $c$ are chosen at random from the full space, there is no requirement that they be adjacent.
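The approach of the cell-sampling probability to the Poisson form can be seen directly by evaluating the product in equation (C1). The sketch below writes (C1) as a binomial coefficient times the two ratios of falling products (the hypergeometric probability for sampling without replacement) and compares it with a Poisson distribution of mean $cN/C$. The values of $C$, $c$ and $N$ are illustrative, and `p_cells` is a hypothetical name.

```python
from math import comb, exp, factorial

def p_cells(n, c, C, N):
    """Probability that n of c cells are occupied when N of the C cells are
    occupied and the c cells are drawn without replacement."""
    prob = float(comb(c, n))
    for i in range(n):            # N(N-1)...(N-n+1) / C(C-1)...(C-n+1)
        prob *= (N - i) / (C - i)
    for i in range(c - n):        # (C-N)...(C-N-(c-n)+1) / (C-n)...(C-c+1)
        prob *= (C - N - i) / (C - n - i)
    return prob

def poisson(n, mean):
    return mean**n * exp(-mean) / factorial(n)

C, c, N = 10**6, 3000, 1000       # many cells, few of them sampled or occupied
mean = c * N / C                  # plays the role of nbar * V

exact = [p_cells(n, c, C, N) for n in range(31)]
limit = [poisson(n, mean) for n in range(31)]
max_diff = max(abs(a - b) for a, b in zip(exact, limit))
total = sum(exact)
print(mean, max_diff, total)
```

Increasing $C$ at fixed $cN/C$ shrinks `max_diff`, which is the numerical counterpart of the limit taken in the text.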


A similar argument can be used to derive equation (A2). Namely, suppose $V_2$, containing exactly $N$ particles, is divided up into a large number $C$ of small volumes $v$. Then, the probability that in $c$ volumes, chosen randomly without replacement from $C$, there are exactly $n$ occupied volumes, when it is known that there are exactly $N$ occupied volumes in $C$, is given by the same expression (C1) as before. When $C \gg c \gg N > n$, Stirling's approximation for all the factorials except the $\binom{N}{n}$ term reduces this to

$$\binom{N}{n}\,f^{\,n}\,(1-f)^{N-n}. \tag{C2}$$

With $cv = V_1$, this is the same as equation (A2), since $Cv = V_2$. Again, the only constraint on the volume elements $c$ is that they lie entirely within $V_2$. There is no requirement that they be adjacent.
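The binomial limit (C2) can be checked in the same spirit. The sketch below uses the symmetric closed form of the hypergeometric probability, $\binom{c}{n}\binom{C-c}{N-n}/\binom{C}{N}$, which equals the sampling probability of (C1), and it assumes $f = c/C = V_1/V_2$, as implied by $cv = V_1$ and $Cv = V_2$. The numbers and function names are illustrative.

```python
from math import comb

def hypergeometric(n, c, C, N):
    """Probability that n of the N occupied cells fall among the c chosen cells."""
    return comb(c, n) * comb(C - c, N - n) / comb(C, N)

def binomial(n, N, f):
    return comb(N, n) * f**n * (1.0 - f)**(N - n)

C, c, N = 10**7, 3 * 10**6, 10    # many small cells, a handful of particles
f = c / C                          # assumed volume fraction V1/V2

exact = [hypergeometric(n, c, C, N) for n in range(N + 1)]
limit = [binomial(n, N, f) for n in range(N + 1)]
max_diff = max(abs(a - b) for a, b in zip(exact, limit))
print(f, max_diff)
```

Because only the fraction $f$ enters the limit, the chosen cells need not be adjacent, in line with the remark in the text.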

What remains to be shown is that $F(j|k)$ is also obtained by a sampling process in which the different volume elements which make up ${}_1V_j$ are chosen randomly without replacement from ${}_2V_k$, so, in particular, they are not necessarily adjacent to each other. This follows from the original derivation, or from the fact that the derivative of $p(j|k)$ is so easily related to $F(j|k)$, or from the derivation of $f(j|k)$ given in Sheth (1995).